Week 2-4-Organizing and Visualizing Variables
https://www.webdesignerdepot.com/2009/06/50-great-examples-of-data-visualization/
ORGANIZING
1 CATEGORICAL Variables
Main Reason Young Adults Shop Online

Reason For Shopping Online?            Percent
Better prices                            37%
Avoiding holiday crowds or hassles       29%
Convenience                              18%
Better selection                         13%
Ships directly                            3%
Source: Data extracted and adapted from “Main Reason Young Adults Shop Online?” USA Today, December 5, 2012, p. 1A.
83.75% of sampled invoices have no errors and 47.50% of sampled invoices are for
small amounts.
Contingency Table based on Percentage Of Rows Total

Counts
                  No Errors   Errors   Total
Small Amount         170        20      190
Medium Amount        100        40      140
Large Amount          65         5       70
Total                335        65      400

Percentage of row total
(89.47% = 170 / 190, 71.43% = 100 / 140, 92.86% = 65 / 70)
                  No Errors   Errors   Total
Small Amount        89.47%    10.53%   100.0%
Medium Amount       71.43%    28.57%   100.0%
Large Amount        92.86%     7.14%   100.0%
Total               83.75%    16.25%   100.0%

Findings
Medium invoices have a larger chance (28.57%) of having errors than small (10.53%) or
large (7.14%) invoices.
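
The row percentages can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original slides: the dictionary layout and the function name `row_percentages` are assumptions made for illustration, while the counts are the ones in the invoice table.

```python
# Counts from the invoice contingency table (rows: invoice size).
counts = {
    "Small Amount": {"No Errors": 170, "Errors": 20},
    "Medium Amount": {"No Errors": 100, "Errors": 40},
    "Large Amount": {"No Errors": 65, "Errors": 5},
}


def row_percentages(table):
    """Return each cell as a percentage of its row total (2 decimals)."""
    result = {}
    for row_label, cells in table.items():
        row_total = sum(cells.values())
        result[row_label] = {
            col: round(100 * n / row_total, 2) for col, n in cells.items()
        }
    return result


row_pct = row_percentages(counts)
for label, cells in row_pct.items():
    print(label, cells)
```

Each row sums to 100%, so the comparison is between invoice sizes: the "Errors" entry answers "of the invoices of this size, what share had errors?"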
Contingency Table based on Percentage Of Columns Total

Counts
                  No Errors   Errors   Total
Small Amount         170        20      190
Medium Amount        100        40      140
Large Amount          65         5       70
Total                335        65      400

Percentage of column total
(50.75% = 170 / 335, 30.77% = 20 / 65)
                  No Errors   Errors   Total
Small Amount        50.75%    30.77%   47.50%
Medium Amount       29.85%    61.54%   35.00%
Large Amount        19.40%     7.69%   17.50%
Total               100.0%    100.0%   100.0%

Findings
There is a 61.54% chance that invoices with errors are of medium size.
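
Column percentages divide each count by its column total instead of its row total. The sketch below is an illustration only (the list-of-lists layout and variable names are assumptions); it uses `zip(*counts)` to transpose the table so that the column totals can be summed.

```python
# Rows: Small, Medium, Large amount; columns: No Errors, Errors.
row_labels = ["Small Amount", "Medium Amount", "Large Amount"]
col_labels = ["No Errors", "Errors"]
counts = [
    [170, 20],
    [100, 40],
    [65, 5],
]

# zip(*counts) transposes the table, yielding one tuple per column.
col_totals = [sum(col) for col in zip(*counts)]

# Divide every cell by the total of the column it sits in.
col_pct = [
    [round(100 * n / total, 2) for n, total in zip(row, col_totals)]
    for row in counts
]

for label, row in zip(row_labels, col_pct):
    print(label, dict(zip(col_labels, row)))
```

Here each column sums to 100%, so the "Errors" column answers "of the invoices with errors, what share were of this size?"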
A Contingency Table – WHY?
1. Study patterns that may exist between the responses of two or
more categorical variables
2. Cross tabulates or tallies jointly the responses of the
categorical variables
3. For two variables the tallies for one variable are located in the
rows and the tallies for the second variable are located in the
columns
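
The three points above describe a joint tally. As a rough sketch of that idea, the code below cross-tabulates a handful of raw records; the records themselves and the helper name `cross_tabulate` are hypothetical and are not taken from the invoice data.

```python
from collections import Counter

# Hypothetical raw records: one (invoice size, error status) pair per invoice.
records = [
    ("Small", "No Errors"), ("Small", "Errors"), ("Medium", "No Errors"),
    ("Medium", "Errors"), ("Medium", "Errors"), ("Large", "No Errors"),
    ("Small", "No Errors"),
]


def cross_tabulate(pairs):
    """Tally pairs jointly: first variable in rows, second in columns."""
    joint = Counter(pairs)  # counts each (row, column) combination
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Counter returns 0 for combinations that never occur.
    return {r: {c: joint[(r, c)] for c in cols} for r in rows}


table = cross_tabulate(records)
for row_label, cells in table.items():
    print(row_label, cells)
```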
Categorical Data: Tallying Data
One Categorical Variable: Summary Table
Two/More Categorical Variables: Contingency Table
ORGANIZING
2 NUMERICAL Variables
Age of Surveyed College Students in Ascending Order
Day Students
16 17 17 18 18 18
19 19 20 20 21 22
22 25 27 32 38 42
Night Students
18 18 19 19 20 21
23 28 32 33 41 45
Cumulative distribution (column headings only): Class, Frequency, Percentage, Cumulative Frequency, Cumulative Percentage

Frequency distribution with relative frequencies and percentages

Class                  Frequency   Relative Frequency   Percentage
10 but less than 20        3              .15               15%
20 but less than 30        6              .30               30%
30 but less than 40        5              .25               25%
40 but less than 50        4              .20               20%
50 but less than 60        2              .10               10%
Total                     20             1.00              100%
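
The relative frequency, percentage, and cumulative columns all follow from the class frequencies. The following is a minimal sketch using the five frequencies from the table; the variable names are illustrative assumptions.

```python
from itertools import accumulate

classes = [
    "10 but less than 20",
    "20 but less than 30",
    "30 but less than 40",
    "40 but less than 50",
    "50 but less than 60",
]
frequencies = [3, 6, 5, 4, 2]
total = sum(frequencies)

# Relative frequency = class frequency / total; percentage = 100 times that.
relative = [f / total for f in frequencies]
percentages = [100 * f / total for f in frequencies]

# Cumulative columns are running totals of the frequencies.
cumulative_freq = list(accumulate(frequencies))
cumulative_pct = [100 * c / total for c in cumulative_freq]

for row in zip(classes, frequencies, relative, percentages,
               cumulative_freq, cumulative_pct):
    print(*row, sep="\t")
```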
Numerical Data
One Variable: Summary Table
Two Variables: Contingency Table
Finding from the column percentages: invoices with errors are much more likely to be of medium size (61.54%) than of small (30.77%) or large (7.69%) size.
VISUALIZING
3 CATEGORICAL Variables
Categorical Data: Visualizing Data
One Variable: Summary Table
Two Variables: Contingency Table
[Figure: Histogram: Age Of Students. Vertical axis: Frequency. Horizontal axis ticks: 5, 15, 25, 35, 45, 55, More.]
(In a percentage histogram the vertical axis would be defined to show the percentage of observations per class.)
[Figure: The Frequency Polygon. Useful When Comparing Two or More Groups.]
[Figure: The Percentage Polygon.]
The Polygon
1. A percentage polygon is formed by having the
midpoint of each class represent the data in that class
and then connecting the sequence of midpoints at their
respective class percentages.
2. The cumulative percentage polygon, or ogive,
displays the variable of interest along the X axis, and
the cumulative percentages along the Y axis.
3. Useful when there are two or more groups to compare.
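
The plotting coordinates for both polygons can be computed directly from a frequency distribution. The sketch below is illustrative only: it assumes classes of width 10 starting at 10 with frequencies 3, 6, 5, 4, 2, and it assumes the common convention that the ogive is plotted at class boundaries, starting from 0% at the first lower boundary.

```python
lower_bounds = [10, 20, 30, 40, 50]
width = 10
frequencies = [3, 6, 5, 4, 2]
total = sum(frequencies)

# Percentage polygon: (class midpoint, class percentage).
midpoints = [lb + width / 2 for lb in lower_bounds]
polygon_points = [
    (m, 100 * f / total) for m, f in zip(midpoints, frequencies)
]

# Ogive: (class boundary, percentage of observations below that boundary).
boundaries = lower_bounds + [lower_bounds[-1] + width]
running = 0
ogive_points = []
for boundary, f in zip(boundaries, [0] + frequencies):
    running += f  # nothing lies below the first lower boundary
    ogive_points.append((boundary, 100 * running / total))

print(polygon_points)
print(ogive_points)
```

Connecting `polygon_points` in order gives the percentage polygon; connecting `ogive_points` gives a curve that never decreases and ends at 100%.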
Two Numerical Variables: Scatter Plot, Time-Series Plot
Scatter Plot
Used for numerical data consisting of paired observations taken from two numerical variables, to examine possible relationships between the two variables.
[Figure: scatter plot. Axis label: Volume per Day (ticks 20, 30, 40, 50, 60, 70). Data rows that survive: 42 170; 50 188; 55 195; 60 200.]
Time Series Plot
Used to study patterns in the values of a numerical variable over time.
Number of Franchises, 1996-2004

Year    Number of Franchises
1996     43
1997     54
1998     60
1999     73
2000     82
2001     95
2002    107
2003     99
2004     95

[Figure: time-series plot of Number of Franchises (vertical axis, 0 to 120) against Year (horizontal axis, 1994 to 2006).]
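
One simple way to describe the pattern in a time series is to look at the change from one year to the next. The sketch below does this for the franchise counts; the variable names and the choice of year-over-year differences are illustrative assumptions rather than a method prescribed by the slides.

```python
# Number of franchises by year, from the table.
franchises = {
    1996: 43, 1997: 54, 1998: 60, 1999: 73, 2000: 82,
    2001: 95, 2002: 107, 2003: 99, 2004: 95,
}

years = sorted(franchises)

# Change relative to the previous year (positive = growth).
changes = {y: franchises[y] - franchises[y - 1] for y in years[1:]}

# Year with the largest number of franchises.
peak_year = max(franchises, key=franchises.get)

for year in years[1:]:
    print(year, franchises[year], f"{changes[year]:+d}")
print("Peak year:", peak_year)
```

The differences are positive through 2002 and negative afterwards, which matches a series that rises to a peak and then declines.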
Numerical Data: Ordered Array, Stem-and-Leaf Display, Frequency Distributions and Cumulative Distributions, Histogram, Polygon, Ogive
ORGANIZING (1, 2): TABLE
VISUALIZING (3, 4): CHART