Thang_MAS202_Chap02. Organizing and Visualizing variables (2.5-2.8)

Chapter 2 focuses on organizing and visualizing both categorical and numerical variables, emphasizing the importance of avoiding common errors in data presentation. It introduces various graphical displays such as scatter plots, time series plots, and multidimensional contingency tables, along with tools like Excel and JMP for data analysis. The chapter concludes with best practices for constructing effective visualizations to enhance data comprehension.

Uploaded by

tanndss180108

Chapter 2

Organizing and
Visualizing Variables

ALWAYS LEARNING Copyright © 2020 Pearson Education Ltd.


Objectives
In this chapter you learn:
• How to organize and visualize categorical variables.
• How to organize and visualize numerical variables.
• How to summarize a mix of variables.
• How to avoid making common errors when organizing and visualizing variables.



Visualizing Two Numerical Variables
By Using Graphical Displays
DCOVA

Two Numerical Variables
• Scatter Plot
• Time-Series Plot



Visualizing Two Numerical
Variables: The Scatter Plot
DCOVA
• Scatter plots are used for numerical data consisting of paired observations taken from two numerical variables.
• One variable’s values are displayed on the horizontal or X axis and the other variable’s values are displayed on the vertical or Y axis.
• Scatter plots are used to examine possible relationships between two numerical variables.



Scatter Plot Example
DCOVA

Volume per day    Cost per day
23                125
26                140
29                146
33                160
38                167
42                170
50                188
55                195
60                200

[Figure: scatter plot "Cost per Day vs. Production Volume"; X axis: Volume per Day (20 to 70); Y axis: Cost per Day (0 to 250)]
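
As a rough sketch of how the paired observations on this slide could be examined outside Excel or JMP, the Python fragment below builds the (X, Y) pairs and prints a crude character-grid scatter plot. Python is not used in the slides; the function name `text_scatter` and the grid size are illustrative assumptions.

```python
# Paired observations from the slide: volume per day (X) and cost per day (Y).
volume = [23, 26, 29, 33, 38, 42, 50, 55, 60]
cost = [125, 140, 146, 160, 167, 170, 188, 195, 200]


def text_scatter(xs, ys, width=40, height=10):
    """Return a crude character-grid scatter plot (X horizontal, Y vertical)."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 is the top of the plot
    return "\n".join("".join(line).rstrip() for line in grid)


print(text_scatter(volume, cost))

# In this sample every increase in volume comes with an increase in cost,
# which is the upward pattern the scatter plot makes visible.
pairs = sorted(zip(volume, cost))
always_rising = all(c1 < c2 for (_, c1), (_, c2) in zip(pairs, pairs[1:]))
print(always_rising)
```

The points run from the lower left to the upper right, suggesting a positive relationship between production volume and cost per day.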



Visualizing Two Numerical
Variables: The Time Series Plot
DCOVA
• A Time-Series Plot is used to study patterns in the values of a numeric variable over time.
• The Time-Series Plot:
  - Numeric variable’s values are on the vertical axis and the time period is on the horizontal axis.



Time Series Plot Example
DCOVA
Year    Number of Franchises
2009    43
2010    54
2011    60
2012    73
2013    82
2014    95
2015    107
2016    99
2017    95

[Figure: time-series plot "Number of Franchises, 2009-2017"; X axis: Year (2008 to 2018); Y axis: Number of Franchises (0 to 120)]
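
The pattern a time-series plot shows can also be read off numerically. The sketch below, in Python (not a tool covered by the slides), computes the year-over-year change and the peak year for the franchise data on this slide; the variable names are illustrative.

```python
years = list(range(2009, 2018))
franchises = [43, 54, 60, 73, 82, 95, 107, 99, 95]

# Year-over-year change: what the plot shows as the slope between points.
changes = {
    year: current - previous
    for year, previous, current in zip(years[1:], franchises, franchises[1:])
}

# Year with the largest number of franchises.
peak_year = max(zip(franchises, years))[1]

for year, change in changes.items():
    print(f"{year}: {change:+d}")
print("peak year:", peak_year)
```

The counts rise every year through the peak and decline in the two years after it, which is the shape the plotted line would trace.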



Organizing A Mix Of Variables: The
Multidimensional Contingency Table
DCOVA
• A multidimensional contingency table is constructed by tallying the responses of three or more categorical variables.
  - Can be used to discover possible patterns and relationships in multidimensional data that simpler tables and charts would fail to make apparent.
• As a practical rule, tables should be limited to no more than three or four variables.
• In typical use, these tables:
  - Extend contingency tables to two or more row or column variables, or
  - Replace the frequencies found in a contingency table with summary information about a numeric variable.
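
A minimal sketch of the tallying idea, written in Python rather than Excel, JMP, or Minitab. The six fund records are hypothetical and are not taken from the 479-fund sample used in the slides.

```python
from collections import Counter

# Hypothetical records: (fund type, market cap, risk level).
funds = [
    ("Growth", "Large", "Low"),
    ("Growth", "Large", "Low"),
    ("Growth", "Small", "High"),
    ("Value", "Large", "Low"),
    ("Value", "Small", "Average"),
    ("Value", "Small", "Low"),
]

# Tally the joint responses of the three categorical variables.
table = Counter(funds)

# Collapsing over market cap recovers the two-way table of
# fund type by risk level.
two_way = Counter((fund_type, risk) for fund_type, _, risk in funds)

for (fund_type, cap, risk), count in sorted(table.items()):
    print(f"{fund_type:<8}{cap:<7}{risk:<9}{count}")
```

Each key of `table` is one cell of the three-way table, and `two_way` shows how the simpler contingency table is a summary of it.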



A Multidimensional Contingency Table
Tallies Responses Of Three or More
Categorical Variables
DCOVA

Two-Dimensional Table: shows Fund Type and Risk Level for the sample of 479 retirement funds.
Three-Dimensional Table: shows Fund Type, Market Cap, and Risk Level for the sample of 479 retirement funds.



Excel, Minitab, and JMP Can Be Used To
Create Multidimensional Contingency Tables
DCOVA
• In Excel, creating a PivotTable yields an interactive display of this type.
• In JMP, you can create a table that is also interactive.
• In Minitab, you can create such a table, but it is not interactive.
• JMP and Minitab have many specialized statistical & graphical procedures (not covered in this book) to analyze & visualize multidimensional data.



Drilling-Down On A Table Reveals
The Data The Table Summarizes
DCOVA
• Clicking a cell in an Excel table displays the rows of data associated with that cell.
• Clicking a cell in a JMP table highlights the rows of data that are the source for that cell.
• Drill-down is perhaps the simplest form of data discovery.
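
Drill-down can be thought of as selecting the source rows behind one summary cell. The Python sketch below illustrates that idea only; it is not how Excel or JMP implement it. The fund rows, the field names, and the helper `drill_down` are hypothetical.

```python
# Hypothetical fund rows; names and returns are invented for illustration.
rows = [
    {"fund": "F01", "type": "Value", "cap": "Small", "risk": "Low", "return_10yr": 6.1},
    {"fund": "F02", "type": "Value", "cap": "Small", "risk": "Low", "return_10yr": 8.7},
    {"fund": "F03", "type": "Value", "cap": "Large", "risk": "Low", "return_10yr": 9.9},
    {"fund": "F04", "type": "Growth", "cap": "Small", "risk": "High", "return_10yr": 12.3},
]


def drill_down(data, **cell):
    """Return the source rows behind one cell of a summary table."""
    return [row for row in data if all(row[k] == v for k, v in cell.items())]


# The cell for small value funds with low risk.
cell_rows = drill_down(rows, type="Value", cap="Small", risk="Low")
returns = [row["return_10yr"] for row in cell_rows]
print(len(cell_rows), min(returns), max(returns))
```

The summary table would show only a count for this cell; drilling down exposes the individual rows and, for example, the range of their returns.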



Drill-Down Reveals The Data
Underlying A Higher-Level Summary
DCOVA
Drilling down to the details about small value funds with low risk reveals that the ten-year return ranges from 4.83% to 9.44%.



Displays To Visualize A Mix Of Many
Variables
DCOVA
• Displays are more useful than a multidimensional contingency table with many row and column variables.
• The data (not just summary statistics) can be shown for numerical variables.
• Multiple numerical variables can be presented in one summarization.
• Visualizations can reveal patterns that can be hard to see in tables.



Colored Scatter Plots Visualize Both
Numerical Variables & Categorical Variable(s)
DCOVA
Observations: Large Market Capitalization Funds (red dots)
1. Relatively speaking, have the best returns and the lowest expense ratios.
2. Some have low returns, high expense ratios, or both.

[Figure: JMP colored scatter plot]



Bubble Charts Extend Scatter Plots
DCOVA
• Use the size of points (called bubbles) to represent the value of an additional variable.
• In Excel and Minitab the additional variable must be numerical.
• In JMP the variable can be either numerical or categorical.



An Excel PivotChart Visualizes Specific
Categories From A PivotTable Summary
DCOVA
Low Risk Small Market Cap Funds Have The Highest Mean Return Among Low Risk Funds.



Treemaps Are Graphical Displays Of
Contingency Tables
DCOVA
Excel Treemap: Size of tiles corresponds to the frequency in a cell.
JMP Treemap: Size of tiles corresponds to the value of the numeric variable Market Cap.



Sparklines Are Compact Time-Series
Visualizations Of Numerical Variables
DCOVA

Movie revenues by week per month
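
To make the idea of a sparkline concrete, here is a small Python sketch that renders a series as a row of Unicode block characters. It is an illustration only, not Excel's sparkline feature; the function `sparkline` and the revenue figures are hypothetical, since the slide's actual data are not in the text.

```python
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Map each value to one of eight block characters, scaled to the series range."""
    low, high = min(values), max(values)
    span = (high - low) or 1  # avoid dividing by zero for a flat series
    return "".join(BARS[int((v - low) / span * (len(BARS) - 1))] for v in values)


# Hypothetical monthly revenues for one year.
revenues = [620, 540, 710, 680, 900, 1100, 1250, 830, 560, 640, 780, 1020]
print(sparkline(revenues))
```

Because each series is scaled to its own range, a sparkline shows the shape of a trend in very little space but not its absolute level.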



Filtering and Querying Data
DCOVA
• Two operations associated with preparing tabular or visual summaries are data filtering and querying.
• Data filtering selects rows of data that match criteria: specified value(s) for specific variable(s).
• Data querying is similar but may not select all of the columns of the matching rows.
• Excel, JMP, and Minitab all have various filtering & query features.



Example Of JMP & Minitab Filtering /
Querying
DCOVA
Selecting all rows for value retirement funds that have ten-year return percentages greater than or equal to 9.
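
The difference between filtering and querying can be sketched in a few lines of Python. This is an illustration of the two operations, not the JMP or Minitab procedure; the fund rows and field names are hypothetical, and only the selection criteria echo the example above.

```python
# Hypothetical fund rows.
funds = [
    {"fund": "F01", "type": "Value", "return_10yr": 9.4, "expense_ratio": 1.05},
    {"fund": "F02", "type": "Growth", "return_10yr": 11.2, "expense_ratio": 0.94},
    {"fund": "F03", "type": "Value", "return_10yr": 7.8, "expense_ratio": 1.18},
    {"fund": "F04", "type": "Value", "return_10yr": 9.0, "expense_ratio": 0.83},
]

# Filtering: keep whole rows that match the criteria.
filtered = [r for r in funds if r["type"] == "Value" and r["return_10yr"] >= 9]

# Querying: same criteria, but return only some columns of the matching rows.
queried = [{"fund": r["fund"], "return_10yr": r["return_10yr"]} for r in filtered]

print(filtered)
print(queried)
```

Both results contain the same funds; the query simply carries fewer columns for each one.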



Excel Slicers Filter & Query Data
From A Pivot Table
DCOVA
• A slicer is a panel of clickable buttons superimposed over a worksheet.
• Each button in a slicer represents a unique value of a variable found in the source data of a PivotTable.
• By clicking buttons in the slicer panels, you query the data.



Example Of Slicers For The
Retirement Funds Workbook
DCOVA
With the four slicers below, you can ask questions such as:
1. What are the attributes of the fund(s) with the lowest expense ratio?
2. What are the expense ratios associated with large market cap value
funds that have a star rating of five?



Answering The Questions
DCOVA
1. What are the attributes of the fund(s) with the lowest expense ratio?
   The updated PivotTable (not shown) reveals only one such fund.
2. What are the expense ratios associated with large market cap value funds that have a star rating of five?
   The expense ratios for these funds are: 0.83, 0.94, 1.05, 1.09, 1.18, and 1.19.



The Challenges in Organizing
and Visualizing Variables
DCOVA

• When organizing and visualizing data, you need to be mindful of:
  - The limits of others’ ability to perceive and comprehend.
  - Presentation issues that can undercut the usefulness of methods from this chapter.
• It is easy to create summaries that:
  - Obscure the data,
  - Create false impressions, or
  - Contain chartjunk.



An Example Of Obscuring Data,
Information Overload
DCOVA



False Impressions Can Be
Created In Many Ways
DCOVA

• Selective summarization:
  - Presenting only part of the data collected.
• Improperly constructed charts:
  - Potential pie chart issues.
  - Improperly scaled axes.
  - A Y axis that does not begin at the origin or is a broken axis missing intermediate values.
• Chartjunk.


An Example of Selective Summarization: These
Two Summarizations Tell Totally Different Stories
DCOVA

Summary 1:

Company   Change from Prior Year
A         +7.2%
B         +24.4%
C         +24.9%
D         +24.8%
E         +12.5%
F         +35.1%
G         +29.7%

Summary 2:

Company   Year 1    Year 2    Year 3
A         -22.6%    -33.2%    +7.2%
B         -4.5%     -41.9%    +24.4%
C         -18.5%    -31.5%    +24.9%
D         -29.4%    -48.1%    +24.8%
E         -1.9%     -25.3%    +12.5%
F         -1.6%     -37.8%    +35.1%
G         +7.4%     -13.6%    +29.7%
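
One way to see how different the two stories are is to compound the three yearly changes into a single overall change. The Python sketch below does this under an assumption the slide does not state explicitly: that the Year 1, Year 2, and Year 3 columns are successive year-over-year percentage changes. The function name `cumulative_change` is illustrative.

```python
# Yearly percentage changes from the three-year summary
# (assumed to be successive year-over-year changes).
changes = {
    "A": [-22.6, -33.2, 7.2],
    "B": [-4.5, -41.9, 24.4],
    "C": [-18.5, -31.5, 24.9],
    "D": [-29.4, -48.1, 24.8],
    "E": [-1.9, -25.3, 12.5],
    "F": [-1.6, -37.8, 35.1],
    "G": [7.4, -13.6, 29.7],
}


def cumulative_change(percent_changes):
    """Compound successive percentage changes into one overall percentage change."""
    factor = 1.0
    for pct in percent_changes:
        factor *= 1 + pct / 100
    return (factor - 1) * 100


for company, pcts in changes.items():
    overall = cumulative_change(pcts)
    print(f"{company}: last year {pcts[-1]:+.1f}%, three years {overall:+.1f}%")
```

Under that assumption, every company shows a gain in the most recent year, yet all but company G remain below their starting level over the three years. Presenting only the last column hides this.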



How Obvious Is It That Both Pie Charts
Summarize The Same Data?
DCOVA

Why is it hard to tell? What would you do to improve?



Graphical Errors:
No Relative Basis
DCOVA

Bad Presentation: [Figure: "A’s received by students", vertical axis in frequencies (0 to 300), by class FR, SO, JR, SR]
Good Presentation: [Figure: "A’s received by students", vertical axis in percentages (0% to 30%), by class FR, SO, JR, SR]

FR = Freshmen, SO = Sophomore, JR = Junior, SR = Senior
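
The point of a relative basis can be shown with a short calculation. In the Python sketch below, all counts are hypothetical (the bar heights on the slide cannot be recovered from the text); the class with the most A's in raw frequency turns out to have the lowest percentage of A's.

```python
# Hypothetical counts: A's received and students enrolled in each class.
a_grades = {"FR": 300, "SO": 200, "JR": 150, "SR": 100}
enrolled = {"FR": 3000, "SO": 1000, "JR": 600, "SR": 400}

# Relative basis: A's as a percentage of the students in each class.
percent_a = {cls: 100 * a_grades[cls] / enrolled[cls] for cls in a_grades}

for cls in a_grades:
    print(f"{cls}: {a_grades[cls]} A's, {percent_a[cls]:.0f}% of class")
```

A chart of the raw frequencies would suggest freshmen earn the most A's, while a chart of percentages shows they earn A's at the lowest rate in this made-up data.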



Graphical Errors:
Compressing the Vertical Axis
DCOVA

Bad Presentation: [Figure: "Quarterly Sales", vertical axis in $ from 0 to 200, quarters Q1 to Q4]
Good Presentation: [Figure: "Quarterly Sales", vertical axis in $ from 0 to 50, quarters Q1 to Q4]



Graphical Errors: No Zero Point
on the Vertical Axis
DCOVA

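Bad Presentation: [Figure: "Monthly Sales", vertical axis in $ showing 36, 39, 42, 45 with no zero point, months J F M A M J]
Good Presentations: [Figure: "Monthly Sales", vertical axis in $ showing 0, 36, 39, 42, 45, months J F M A M J]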

Graphing the first six months of sales
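
The distortion from a missing zero point can be quantified. The Python sketch below compares how tall the largest value is drawn relative to the smallest when the vertical axis starts at 0 and when it starts at 36. The sales figures are hypothetical values inside the 36 to 45 range on the slide's axis, and `drawn_ratio` is an illustrative helper.

```python
# Hypothetical sales for the first six months.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
sales = [38, 39, 40, 41, 43, 44]


def drawn_ratio(values, axis_start):
    """Ratio of the tallest to the shortest plotted height for a given axis start."""
    heights = [v - axis_start for v in values]
    return max(heights) / min(heights)


honest = drawn_ratio(sales, axis_start=0)      # heights 44 and 38
truncated = drawn_ratio(sales, axis_start=36)  # heights 8 and 2

print(f"axis from 0:  tallest is {honest:.2f}x the shortest")
print(f"axis from 36: tallest is {truncated:.2f}x the shortest")
```

With these numbers a rise of roughly 16% is drawn as a fourfold difference once the axis is cut off at 36, which is why the slide recommends showing the zero point.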



Graphical Errors: Chart Junk,
Can You Identify The Junk?
DCOVA

[Figure: Bad Presentation and Good Presentation]

Left illustration adapted from S. Watterson, “Liquid Gold—Australians Are Changing the World of Wine. Even the French Seem Grateful.” Time,
November 22, 1999, p. 68-69



Best Practices for Constructing
Visualizations
DCOVA
• Use the simplest possible visualization.
• Include a title & label all axes.
• Include a scale for each axis if the chart contains axes.
• Begin the scale for a vertical axis at zero & use a constant scale.
• Avoid 3D or “exploded” effects & the use of chartjunk.
• Use consistent colorings in charts meant to be compared.
• Avoid using uncommon chart types including radar, surface, bubble, cone, and pyramid charts.



Chapter Summary
In this chapter we covered:
• Organizing and visualizing categorical variables.
• Organizing and visualizing numerical variables.
• Summarizing a mix of variables.
• Avoiding common errors when organizing and visualizing variables.
