0% found this document useful (0 votes)
2 views

Lecture Data Vizualization

Chapter 3 of the document discusses data visualization, emphasizing the importance of effective design techniques such as the data-ink ratio for creating meaningful tables and charts. It covers the use of tables for precise numerical comparisons and introduces crosstabulation and PivotTables in Excel as tools for data analysis. Additionally, the chapter highlights various chart types, including scatter and line charts, for visually representing data relationships and trends.

Uploaded by

Dmitry Paul
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
2 views

Lecture Data Vizualization

Chapter 3 of the document discusses data visualization, emphasizing the importance of effective design techniques such as the data-ink ratio for creating meaningful tables and charts. It covers the use of tables for precise numerical comparisons and introduces crosstabulation and PivotTables in Excel as tools for data analysis. Additionally, the chapter highlights various chart types, including scatter and line charts, for visually representing data relationships and trends.

Uploaded by

Dmitry Paul
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 69

Data Visualization

Chapter 3

© 2019 Cengage. All Rights Reserved.


Introduction
• Data visualization involves:
• Creating a summary table for the data.
• Generating charts to help interpret, analyze, and learn from the data.
• Uses of data visualization:
• Helpful for identifying data errors.
• Reduces the size of your data set by highlighting important
relationships and trends in the data.

© 2019 Cengage. All Rights Reserved.


Overview of Data Visualization
Effective Design Techniques

© 2019 Cengage. All Rights Reserved.


Overview of Data Visualization (Slide 1 of 5)
Effective Design Techniques:
• Data-ink ratio: Measures the proportion of what Tufte terms “data-ink”
to the total amount of ink used in a table or chart.
• Edward R. Tufte first described the data-ink ratio.
• Helpful for creating effective tables and charts for data visualization:
• Data-ink: Ink used in a table or chart that is necessary to convey the
meaning of the data to the audience.
• Non-data-ink: Ink used in a table or chart that serves no useful purpose in
conveying the data to the audience.

© 2019 Cengage. All Rights Reserved.


Overview of Data Visualization (Slide 2 of 5)
Table 3.1: Example Scarf Sales by Day

of a Low Data-Ink Day Sales Day Sales

Ratio Table 1 150 11 170


2 170 12 160
3 140 13 290
4 150 14 200
5 180 15 210
6 180 16 110
7 210 17 90
8 230 18 140
9 140 19 150
10 200 20 230

© 2019 Cengage. All Rights Reserved.


Overview of Data Visualization (Slide 3 of 5)
Figure 3.3: Example
of a Low Data-Ink
Ratio Chart

© 2019 Cengage. All Rights Reserved.


Overview of Data Visualization (Slide 4 of 5)
Table 3.2: Increasing Scarf Sales by Day

the Data-Ink Ratio by Day Sales Day Sales

Removing 1 150 11 170


Unnecessary 2 170 12 160

Gridlines 3 140 13 290


4 150 14 200
5 180 15 210
6 180 16 110
7 210 17 90
8 230 18 140
9 140 19 150
10 200 20 230

© 2019 Cengage. All Rights Reserved.


Overview of Data Visualization (Slide 5 of 5)
Figure 3.4:
Increasing the
Data-Ink Ratio by
Adding Labels to
Axes and Removing
Unnecessary Lines
and Labels

© 2019 Cengage. All Rights Reserved.


Tables
Table Design Principles
Crosstabulation
PivotTables in Excel
Recommended PivotTables in Excel

© 2019 Cengage. All Rights Reserved.


Tables (1 of 18)
Tables should be used when:
1. The reader needs to refer to specific numerical values.
2. The reader needs to make precise comparisons between different values
and not just relative comparisons.
3. The values being displayed have different units or very different
magnitudes.

© 2019 Cengage. All Rights Reserved.


Tables (2 of 18)
Table 3.3: Table Showing Exact Values for Costs and Revenues by Month
for Gossamer Industries
Month
1 2 3 4 5 6 Total
Costs ($) 48,123 56,458 64,125 52,158 54,718 50,985 326,567
Revenues ($) 64,124 66,128 67,125 48,178 51,785 55,687 353,027

© 2019 Cengage. All Rights Reserved.


Tables (3 of 18)
Figure 3.5: Line Chart of Monthly Costs and Revenues at Gossamer
Industries

© 2019 Cengage. All Rights Reserved.


Tables (4 of 18)
Figure 3.6: Combined Line Chart and Table for Monthly Costs and
Revenues at Gossamer Industries

© 2019 Cengage. All Rights Reserved.


Tables (5 of 18)
Table 3.4: Table Displaying Head Count, Costs, and Revenues at
Gossamer Industries
Month
1 2 3 4 5 6 Total
Head Count 8 9 10 9 9 9
Costs ($) 48,123 56,458 64,125 52,158 54,718 50,985 326,567
Revenues ($) 64,124 66,128 67,125 48,178 51,785 55,687 353,027

© 2019 Cengage. All Rights Reserved.


Tables (6 of 18)
Table Design Principles:
• Avoid using vertical lines in a table unless they are necessary for clarity.
• Horizontal lines are generally necessary only for separating column titles
from data values or when indicating that a calculation has taken place.

© 2019 Cengage. All Rights Reserved.


Tables (7 of 18)
Figure 3.7: Comparing Different Table Designs

© 2019 Cengage. All Rights Reserved.


Tables (8 of 18)
Table 3.5: Revenues by Location ($) Month 1 Month 2 Month 3 Month 4 Month 5 Month 6
Larger Table Temple 8,987 8,595 8,958 6,718 8,066 8,574
Showing Killeen 8,212 9,143 8,714 6,869 8,150 8,891
Revenues by Waco 11,603 12,063 11,173 9,622 8,912 9,553

Location for 12 Belton 7,671 7,617 7,896 6,899 7,877 6,621

Months of Data Granger 7,642 7,744 7,836 5,833 6,002 6,728


Harker Heights 5,257 5,326 4,998 4,304 4,106 4,980
Gatesville 5,316 5,245 5,056 3,317 3,852 4,026
Lampasas 5,266 5,129 5,022 3,022 3,088 4,289
Academy 4,170 5,266 7,472 1,594 1,732 2,025
Total 64,124 66,128 67,125 48,178 51,785 55,687
Costs ($) 48,123 56,458 64,125 52,158 54,718 50,985

© 2019 Cengage. All Rights Reserved.


Tables (9 of 18)
Table 3.5 Revenues by
(cont.) Location ($) Month 7 Month 8 Month 9 Month 10 Month 11 Month 12 Total
Temple 8,701 9,490 9,610 9,262 9,875 11,058 107,895
Killeen 8,766 9,193 9,603 10,374 10,456 10,982 109,353
Waco 11,943 12,947 12,925 14,050 14,300 13,877 142,967
Belton 7,765 7,720 7,824 7,938 7,943 7,047 90,819
Granger 7,848 7,717 7,646 7,620 7,728 8,013 88,357
Harker Heights 5,084 5,061 5,186 5,179 4,955 5,326 59,763
Gatesville 5,135 5,132 5,052 5,271 5,304 5,154 57,859
Lampasas 5,110 5,073 4,978 5,343 4,984 5,315 56,620
Academy 8,772 1,956 3,304 3,090 3,579 2,487 45,446
Total 69,125 64,288 66,128 68,128 69,125 69,258 759,079
Costs ($) 57,898 62,050 65,215 61,819 67,828 69,558 710,935

© 2019 Cengage. All Rights Reserved.


Tables (10 of 18)
• Crosstabulation: A useful type of table for describing data of two
variables.
• PivotTable: A crosstabulation in Microsoft Excel.

© 2019 Cengage. All Rights Reserved.


Tables (11 of 18)
Table 3.6: Quality Rating and Meal Price for 300 Los Angeles Restaurants
Restaurant Quality Rating Meal Price ($) Wait Time (min)
1 Good 18 5
2 Very Good 22 6
3 Good 28 1
4 Excellent 38 74
5 Very Good 33 6
6 Good 28 5
7 Very Good 19 11
8 Very Good 11 9
9 Very Good 23 13
10 Good 13 1

© 2019 Cengage. All Rights Reserved.


Tables (12 of 18)
Table 3.7: Crosstabulation of Quality Rating and Meal Price for 300 Los
Angeles Restaurants
Meal Price
Quality Rating $10–19 $20–29 $30–39 $40–49 Total
Good 42 40 2 0 84
Very Good 34 64 46 6 150
Excellent 2 14 28 22 66
Total 78 118 76 28 300
• The greatest number of restaurants in the sample (64) have a very good rating and a meal price in
the $20–29 range.
• Only two restaurants have an excellent rating and a meal price in the $10–19 range.
• The right and bottom margins of the crosstabulation give the frequencies of quality rating and
meal price separately.

© 2019 Cengage. All Rights Reserved.


Tables (13 of 18)
Figure 3.8: Excel Worksheet
Containing Restaurant Data

© 2019 Cengage. All Rights Reserved.


Tables (14 of 18)
Figure 3.9: Initial
PivotTable Field List and
PivotTable Field Report
for the Restaurant Data

© 2019 Cengage. All Rights Reserved.


Tables (15 of 18)
Figure 3.10:
Completed
PivotTable Field List
and a Portion of the
PivotTable Report
for the Restaurant
Data (Columns H:AK
Are Hidden)

© 2019 Cengage. All Rights Reserved.


Tables (16 of 18)
Figure 3.11: Final
PivotTable Report for
the Restaurant Data

© 2019 Cengage. All Rights Reserved.


Tables (17 of 18)
Figure 3.12: Percent
Frequency Distribution
as a PivotTable for the
Restaurant Data

© 2019 Cengage. All Rights Reserved.


Tables (18 of 18)
Figure 3.13:
PivotTable Report
for the Restaurant
Data with Average
Wait Times Added

© 2019 Cengage. All Rights Reserved.


Charts
Scatter Charts
Recommended Charts in Excel Bubble Charts
Line Charts Heat Maps
Bar Charts and Column Charts Additional Charts for Multiple Variables
A Note on Pie Charts and PivotCharts in Excel
Three-Dimensional Charts

© 2019 Cengage. All Rights Reserved.


Charts (1 of 26)
• Charts (or graphs): Visual methods of displaying data.
• Scatter chart: Graphical presentation of the relationship between
two quantitative variables.
• Trendline: A line that provides an approximation of the relationship
between the variables.
• Line chart: A line connects the points in the chart.
• Useful for time series data collected over a period of time (minutes,
hours, days, years, etc.).

© 2019 Cengage. All Rights Reserved.


Charts (2 of 26) No. of
Table 3.8: Sample Data for the Commercials Sales ($100s)

San Francisco Electronics Store Week x y


1 2 50
2 5 57
3 1 41
4 3 54
5 4 54
6 1 38
7 5 63
8 3 48
9 4 59
10 2 46

© 2019 Cengage. All Rights Reserved.


Charts (3 of 26)
Figure 3.17: Scatter Chart for the San Francisco Electronics Store

© 2019 Cengage. All Rights Reserved.


Charts (4 of 26)
Table 3.9: Monthly Sales Data of Air Compressors at Kirkland Industries
Month Sales ($100s)
Jan 150
Feb 145
Mar 185
Apr 195
May 170
Jun 125
Jul 210
Aug 175
Sep 160
Oct 120
Nov 115
Dec 120

© 2019 Cengage. All Rights Reserved.


Charts (5 of 26)
Figure 3.19: Scatter Chart and Line Chart for Monthly Sales Data at
Kirkland Industries

© 2019 Cengage. All Rights Reserved.


Charts (6 of 26) Sales ($100s) Sales ($100s)
Month North South
Table 3.10: Regional Sales Jan 95 40
Data by Month for Air Feb 100 45
Compressors at Kirkland Mar 120 55
Industries (Kirkland Regional Data) Apr 115 65
May 100 60
Jun 85 50
Jul 135 75
Aug 110 65
Sep 100 60
Oct 50 70
Nov 40 75
Dec 40 80

© 2019 Cengage. All Rights Reserved.


Charts (7 of 26)
Figure 3.21: Line Chart of Regional Sales Data at Kirkland Industries

© 2019 Cengage. All Rights Reserved.


Charts (8 of 26)
Sparkline: Special type of line chart:
• Minimalist type of line chart that can be placed directly into a cell in
Excel.
• Contains no axes; they display only the line for the data.
• Takes up very little space and can be effectively used to provide
information on overall trends for time series data.

© 2019 Cengage. All Rights Reserved.


Charts (9 of 26)
Figure 3.22: Sparklines for the Regional Sales Data at Kirkland Industries

© 2019 Cengage. All Rights Reserved.


Charts (10 of 26)
• Bar Charts: Use horizontal bars to display the magnitude of the
quantitative variable.
• Column Charts: Use vertical bars to display the magnitude of the
quantitative variable.
• Bar and column charts are very helpful in making comparisons
between categorical variables.

© 2019 Cengage. All Rights Reserved.


Charts (11 of 26)
Figure 3.23: Bar Charts
for Accounts Managed
Data

Gentry manages the


greatest number of
accounts and
Williams the fewest.

© 2019 Cengage. All Rights Reserved.


Charts (12 of 26)
Figure 3.24: Sorted
Bar Chart for
Accounts Managed
Data

© 2019 Cengage. All Rights Reserved.


Charts (13 of 26)
Figure 3.25: Bar Chart
with Data Labels for
Accounts Managed
Data

© 2019 Cengage. All Rights Reserved.


Charts (14 of 26)
• Pie chart: Common form of chart used to compare categorical data.
• Bubble chart: Graphical means of visualizing three variables in a
two-dimensional graph that sometimes is a preferred alternative to
a 3-D graph.
• Heat map: A two-dimensional graphical representation of data that
uses different shades of color to indicate magnitude.

© 2019 Cengage. All Rights Reserved.


Charts (15 of 26)
Figure 3.26: Pie Chart of Accounts Managed

© 2019 Cengage. All Rights Reserved.


Charts (16 of 26)
Table 3.11: Sample Data on Billionaires per Country

Billionaires per Per Capita No. of


Country 10M Residents Income Billionaires
United States 54.7 $54,600 1,764
China 1.5 $12,880 213
Germany 12.5 $45,888 103
India 0.7 $ 5,855 90
Russia 6.2 $24,850 88
Mexico 1.2 $17,881 15

© 2019 Cengage. All Rights Reserved.


Charts (17 of 26)
Figure 3.27: Bubble Chart
Comparing Billionaires by
Country

© 2019 Cengage. All Rights Reserved.


Charts (18 of 26)
Figure 3.28: Heat
Map and Sparklines
for Same-Store Sales
Data

© 2019 Cengage. All Rights Reserved.


Charts (19 of 26)
Additional Charts for Multiple Variables:
• Stacked-column chart: Allows the reader to compare the relative values
of quantitative variables for the same category in a bar chart.
• Clustered-column (or bar) chart: An alternative chart to stacked-column
chart for comparing quantitative variables.
• Scatter-chart matrix: Useful chart for displaying multiple variables.

© 2019 Cengage. All Rights Reserved.


Charts (20 of 26)
Figure 3.29: Stacked-
Column Chart for
Regional Sales Data for
Kirkland Industries

© 2019 Cengage. All Rights Reserved.


Charts (21 of 26)
Figure 3.30: Comparing
Stacked-, Clustered-,
and Multiple-Column
Charts for the Regional
Sales Data for Kirkland
Industries

© 2019 Cengage. All Rights Reserved.


Charts (22 of 26)
Table 3.12: Data for New York City Subboroughs
Median Percentage
Monthly Rent College Poverty Rate Travel Time
Area ($) Graduates (%) (%) (min)
Astoria 1,106 36.8 15.9 35.4
Bay Ridge 1,082 34.3 15.6 41.9
Bayside/Little Neck 1,243 41.3 7.6 40.6
Bedford Stuyvesant 822 21.0 34.2 40.5
Bensonhurst 876 17.7 14.4 44.0
Borough Park 980 26.0 27.6 35.3

© 2019 Cengage. All Rights Reserved.


Charts (23 of 26)
Table 3.12: Data for New York City Subboroughs (cont.)
Median Percentage
Monthly Rent College Poverty Rate Travel Time
Area ($) Graduates (%) (%) (min)
Brooklyn Heights/Fort
Greene 1,086 55.3 17.4 34.5
Brownsville/Ocean Hill 714 11.6 36.0 40.3
Bushwick 945 13.3 33.5 35.5
Central Harlem 665 30.6 27.1 25.0
Chelsea/Clinton/Midtown 1,624 66.1 12.7 43.7
Coney Island 786 27.2 20.0 46.3

© 2019 Cengage. All Rights Reserved.


Charts (24 of 26)
Figure 3.31: Scatter-
Chart Matrix for New
York City Rent Data

© 2019 Cengage. All Rights Reserved.


Charts (25 of 26)
PivotCharts in Excel:
PivotChart: To summarize and analyze data with both a crosstabulation
and charting, Excel pairs PivotCharts with PivotTables.

© 2019 Cengage. All Rights Reserved.


Charts (26 of 26)
Figure 3.32:
PivotTable and
PivotChart for the
Restaurant Data

© 2019 Cengage. All Rights Reserved.


Advanced Data Visualization
Advanced Charts
Geographic Information Systems Charts

© 2019 Cengage. All Rights Reserved.


Advanced Data Visualization (1 of 7)
Advanced Charts:
• Parallel-coordinates plot: Chart for examining data with more than two
variables:
• Includes a different vertical axis for each variable.
• Each observation is represented by drawing a line on the parallel-
coordinates plot connecting each vertical axis.
• The height of the line on each vertical axis represents the value taken by
that observation for the variable corresponding to the vertical axis.
• Treemap: Useful for visualizing hierarchical data along multiple
dimensions.

© 2019 Cengage. All Rights Reserved.


Advanced Data Visualization (2 of 7)
Figure 3.33: Parallel-
Coordinates Plot for
Baseball Data

© 2019 Cengage. All Rights Reserved.


Advanced Data Visualization (3 of 7)
Figure 3.34:
SmartMoney’s
Map of the Market
as an Example of a
Treemap

© 2019 Cengage. All Rights Reserved.


Advanced Data Visualization (4 of 7)
Figure 3.35:
Treemap Created in
Excel for Top 100
Global Companies
Data

© 2019 Cengage. All Rights Reserved.


Use the Global 100 Companies Data

© 2019 Cengage. All Rights Reserved.


Advanced Data Visualization (5 of 7)
Geographic Information Systems Charts:
• Geographic information system (GIS): A system that merges maps and
statistics to present data collected over different geographic areas.
• Helps in interpreting data and observing patterns.

© 2019 Cengage. All Rights Reserved.


Advanced Data Visualization (6 of 7)
Figure 3.36: GIS
Chart for Cincinnati
Zoo Member Data

© 2019 Cengage. All Rights Reserved.


Using the World GDP Data

© 2019 Cengage. All Rights Reserved.


© 2019 Cengage. All Rights Reserved.
Advanced Data Visualization (7 of 7)
Figure 3.38:
Completed 3D
Map Created in
Excel for World
GDP Data

© 2019 Cengage. All Rights Reserved.


Data Dashboards
Principles of Effective Data Dashboards
Applications of Data Dashboards

© 2019 Cengage. All Rights Reserved.


Data Dashboards (1 of 3)
• Data dashboard: Data-visualization tool that illustrates multiple
metrics and automatically updates these metrics as new data
become available.
Principles of Effective Data Dashboards:
• Key performance indicators (KPIs) in dashboards:
• Automobile dashboard: Current speed, Fuel level, and oil pressure.
• Business dashboard: Financial position, inventory on hand, customer
service metrics.

© 2019 Cengage. All Rights Reserved.


Data Dashboards (2 of 3)
Principles of Effective Data Dashboards (continued):
• Should provide timely summary information on KPIs that are important to the
user.
• Should present all KPIs as a single screen that a user can quickly scan to
understand the business’s current state of operations.
• The KPIs displayed in the data dashboard should convey meaning to its user and
be related to the decisions the user makes.
• A data dashboard should call attention to unusual measures that may require
attention.
• Color should be used to call attention to specific values to differentiate
categorical variables, but the use of color should be restrained.

© 2019 Cengage. All Rights Reserved.


Data Dashboards (3 of 3)
Figure 3.39: Data
Dashboard for the
Grogan Oil Information
Technology Call Center

© 2019 Cengage. All Rights Reserved.

You might also like