
EBA2123 2.Tables and Graphs (1)

The document provides an overview of descriptive statistics, focusing on summarizing categorical and quantitative data through various methods such as frequency distributions, bar charts, and pie charts. It includes examples, guidelines for creating frequency distributions, and insights gained from data visualization. The content is aimed at helping readers understand how to effectively present and analyze statistical data.

Uploaded by

XIAO LA

Statistics for Business and Economics
Anderson, Sweeney, Williams

Slides by John Loucks, St. Edward’s University

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.

Descriptive Statistics: Tabular and Graphical Presentations

1. Summarizing Categorical Data
2. Summarizing Quantitative Data
3. Exploratory Data Analysis: Stem-and-Leaf Display
4. Crosstabulation and Scatter Diagram

Categorical data use labels or names to identify categories of like items.

Quantitative data are numerical values that indicate how much or how many.

1. Summarizing Categorical Data

1.1 Frequency Distribution
1.2 Relative Frequency Distribution
1.3 Percent Frequency Distribution
1.4 Bar Chart
1.5 Pie Chart

1.1 Frequency Distribution

A frequency distribution is a tabular summary of data showing the frequency (or number) of items in each of several non-overlapping classes.

The objective is to provide insights about the data that cannot be quickly obtained by looking only at the original data.

Frequency Distribution

Example: Marada Inn

Guests staying at Marada Inn were asked to rate the quality of their accommodations as being excellent, above average, average, below average, or poor. The ratings provided by a sample of 20 guests are:

Below Average    Average          Above Average
Above Average    Above Average    Above Average
Above Average    Below Average    Below Average
Average          Poor             Poor
Above Average    Excellent        Above Average
Average          Above Average    Average
Above Average    Average

Frequency Distribution

Example: Marada Inn

Rating           Frequency
Poor             2
Below Average    3
Average          5
Above Average    9
Excellent        1
Total            20

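The tally above can be reproduced with a few lines of code. The following is a minimal sketch, not part of the original slides; the names `ratings`, `ORDER`, and `frequency_distribution` are illustrative assumptions.

```python
from collections import Counter

# The 20 Marada Inn ratings from the sample (order does not affect the counts).
ratings = [
    "Below Average", "Average", "Above Average",
    "Above Average", "Above Average", "Above Average",
    "Above Average", "Below Average", "Below Average",
    "Average", "Poor", "Poor",
    "Above Average", "Excellent", "Above Average",
    "Average", "Above Average", "Average",
    "Above Average", "Average",
]

# Classes listed from worst to best, as in the slide's table.
ORDER = ["Poor", "Below Average", "Average", "Above Average", "Excellent"]


def frequency_distribution(values, classes):
    """Count how many items fall in each non-overlapping class."""
    counts = Counter(values)
    return {c: counts.get(c, 0) for c in classes}


freq = frequency_distribution(ratings, ORDER)
for rating, count in freq.items():
    print(f"{rating:<15}{count:>3}")
print(f"{'Total':<15}{sum(freq.values()):>3}")
```

Passing the class list explicitly keeps the rows in rating order and reports a zero for any class that has no observations.
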
1.2 Relative Frequency Distribution

The relative frequency of a class is the fraction or proportion of the total number of data items belonging to the class.

A relative frequency distribution is a tabular summary of a set of data showing the relative frequency for each class.

1.3 Percent Frequency Distribution

The percent frequency of a class is the relative frequency multiplied by 100.

A percent frequency distribution is a tabular summary of a set of data showing the percent frequency for each class.

Relative Frequency and Percent Frequency Distributions

Example: Marada Inn

Rating           Relative Frequency    Percent Frequency
Poor             .10                   10
Below Average    .15                   15
Average          .25                   25
Above Average    .45                   45
Excellent        .05                   5
Total            1.00                  100

For example, the relative frequency for Excellent is 1/20 = .05, and the percent frequency for Poor is .10(100) = 10.

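As a hedged illustration of the two definitions, the sketch below derives both columns from the Marada Inn frequency counts. The function names are assumptions made for this example only.

```python
# Frequency counts from the Marada Inn frequency distribution.
freq = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}


def relative_frequencies(freq):
    """Relative frequency = class frequency / total number of items."""
    n = sum(freq.values())
    return {label: count / n for label, count in freq.items()}


def percent_frequencies(freq):
    """Percent frequency = relative frequency multiplied by 100."""
    return {label: 100 * rel for label, rel in relative_frequencies(freq).items()}


rel = relative_frequencies(freq)
pct = percent_frequencies(freq)
for label in freq:
    print(f"{label:<15}{rel[label]:>6.2f}{pct[label]:>6.0f}")
```
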
1.4 Bar Chart

• A bar chart is a graphical device for depicting qualitative data.
• On one axis (usually the horizontal axis), we specify the labels that are used for each of the classes.
• A frequency, relative frequency, or percent frequency scale can be used for the other axis (usually the vertical axis).
• Using a bar of fixed width drawn above each class label, we extend the height appropriately.
• The bars are separated to emphasize the fact that each class is a separate category.

Bar Chart

[Figure: bar chart "Marada Inn Quality Ratings". Horizontal axis: Rating (Poor, Below Average, Average, Above Average, Excellent). Vertical axis: Frequency, 1 to 10.]

Pareto Diagram

• In quality control, bar charts are used to identify the most important causes of problems.
• When the bars are arranged in descending order of height from left to right (with the most frequently occurring cause appearing first), the bar chart is called a Pareto diagram.
• This diagram is named for its founder, Vilfredo Pareto, an Italian economist.

1.5 Pie Chart

• The pie chart is a commonly used graphical device for presenting relative frequency and percent frequency distributions for categorical data.
• First draw a circle; then use the relative frequencies to subdivide the circle into sectors that correspond to the relative frequency for each class.
• Since there are 360 degrees in a circle, a class with a relative frequency of .25 would consume .25(360) = 90 degrees of the circle.

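The sector arithmetic can be checked with a short sketch. It is an illustration only; `sector_angles` is a hypothetical helper, and the relative frequencies are those of the Marada Inn example.

```python
# Relative frequencies from the Marada Inn example.
rel = {
    "Poor": 0.10,
    "Below Average": 0.15,
    "Average": 0.25,
    "Above Average": 0.45,
    "Excellent": 0.05,
}


def sector_angles(rel):
    """Each class gets (relative frequency x 360) degrees of the circle."""
    return {label: r * 360 for label, r in rel.items()}


angles = sector_angles(rel)
for label, degrees in angles.items():
    print(f"{label:<15}{degrees:>6.1f} degrees")
```
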
Pie Chart

[Figure: pie chart "Marada Inn Quality Ratings". Sectors: Above Average 45%, Average 25%, Below Average 15%, Poor 10%, Excellent 5%.]

Example: Marada Inn

Insights Gained from the Preceding Pie Chart
  • One-half of the customers surveyed gave Marada a quality rating of “above average” or “excellent” (looking at the left side of the pie). This might please the manager.
  • For each customer who gave an “excellent” rating, there were two customers who gave a “poor” rating (looking at the top of the pie). This should displease the manager.

2. Summarizing Quantitative Data

2.1 Frequency Distribution
2.2 Relative Frequency and Percent Frequency Distributions
2.3 Dot Plot
2.4 Histogram
2.5 Cumulative Distributions
2.6 Ogive

2.1 Frequency Distribution

Example: Hudson Auto Repair

The manager of Hudson Auto would like to gain a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed on the next slide.

Frequency Distribution

Example: Hudson Auto Repair

Sample of Parts Cost ($) for 50 Tune-ups

 91   78   93   57   75   52   99   80   97   62
 71   69   72   89   66   75   79   75   72   76
104   74   62   68   97  105   77   65   80  109
 85   97   88   68   83   68   71   69   67   74
 62   82   98  101   79  105   79   69   62   73

Frequency Distribution

The three steps necessary to define the classes for a frequency distribution with quantitative data are:
1. Determine the number of non-overlapping classes.
2. Determine the width of each class.
3. Determine the class limits.

Frequency Distribution

Guidelines for Determining the Number of Classes
  • Use between 5 and 20 classes.
  • Data sets with a larger number of elements usually require a larger number of classes.
  • Smaller data sets usually require fewer classes.

The goal is to use enough classes to show the variation in the data, but not so many classes that some contain only a few data items.

Frequency Distribution

Guidelines for Determining the Width of Each Class
  • Use classes of equal width.
  • Approximate Class Width = (Largest Data Value - Smallest Data Value) / Number of Classes

Making the classes the same width reduces the chance of inappropriate interpretations.

Class     Grade    Frequency
80-100    A        13
75-79     A-       13
65-69     B        11
60-65     B-       15
70-74     B+       14
50-54     C        8
45-49     C-       6
55-59     C+       3
40-44     D        2
0-39      F        0

Frequency Distribution

Note on Number of Classes and Class Width
  • In practice, the number of classes and the appropriate class width are determined by trial and error.
  • Once a possible number of classes is chosen, the appropriate class width is found.
  • The process can be repeated for a different number of classes.
  • Ultimately, the analyst uses judgment to determine the combination of the number of classes and class width that provides the best frequency distribution for summarizing the data.

Frequency Distribution

Guidelines for Determining the Class Limits
  • Class limits must be chosen so that each data item belongs to one and only one class.
  • The lower class limit identifies the smallest possible data value assigned to the class.
  • The upper class limit identifies the largest possible data value assigned to the class.
  • The appropriate values for the class limits depend on the level of accuracy of the data.

Example upper class limits: ≤ 59, ≤ 69, ≤ 79, ≤ 89, ≤ 99, ≤ 109

An open-end class requires only a lower class limit or an upper class limit.

Frequency Distribution

Example: Hudson Auto Repair

If we choose six classes:
Approximate Class Width = (109 - 52)/6 = 9.5, rounded up to 10

Parts Cost ($)    Frequency
50-59             2
60-69             13
70-79             16
80-89             7
90-99             7
100-109           5
Total             50

In a class such as 50-59, 50 is the lower class limit and 59 is the upper class limit.

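The three steps (number of classes, class width, class limits) can be sketched in code for the Hudson Auto data. This is a minimal illustration under stated assumptions: the data are whole dollars, the first lower class limit is set to 50 by judgment, and the helper names `class_width` and `frequency_table` are hypothetical.

```python
import math

# Parts cost ($) for the 50 tune-ups in the Hudson Auto sample.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]


def class_width(values, num_classes):
    """Approximate class width = (largest - smallest) / number of classes."""
    return (max(values) - min(values)) / num_classes


def frequency_table(values, first_lower_limit, width, num_classes):
    """Count integer-valued data into equal-width classes.

    Returns (lower limit, upper limit, frequency) for each class.
    """
    counts = [0] * num_classes
    for v in values:
        counts[(v - first_lower_limit) // width] += 1
    return [
        (first_lower_limit + i * width, first_lower_limit + (i + 1) * width - 1, counts[i])
        for i in range(num_classes)
    ]


approx = class_width(costs, 6)   # (109 - 52) / 6
width = math.ceil(approx)        # round up to a convenient whole-dollar width
table = frequency_table(costs, first_lower_limit=50, width=width, num_classes=6)
for lower, upper, count in table:
    print(f"{lower}-{upper:<6}{count:>4}")
```

Trying a different number of classes only requires changing `num_classes` and the first lower limit, which mirrors the trial-and-error process described earlier.
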
2.2 Relative Frequency and Percent Frequency Distributions

Example: Hudson Auto Repair

Parts Cost ($)    Relative Frequency    Percent Frequency
50-59             .04                   4
60-69             .26                   26
70-79             .32                   32
80-89             .14                   14
90-99             .14                   14
100-109           .10                   10
Total             1.00                  100

For the 50-59 class, the relative frequency is 2/50 = .04 and the percent frequency is .04(100) = 4. Percent frequency is the relative frequency multiplied by 100.

Relative Frequency and Percent Frequency Distributions

Example: Hudson Auto Repair

Insights Gained from the % Frequency Distribution:
  • Only 4% of the parts costs are in the $50-59 class.
  • 30% of the parts costs are under $70.
  • The greatest percentage (32%, or almost one-third) of the parts costs are in the $70-79 class.
  • 10% of the parts costs are $100 or more.

2.3 Dot Plot

• One of the simplest graphical summaries of data is a dot plot.
• A horizontal axis shows the range of data values.
• Then each data value is represented by a dot placed above the axis.

Dot Plot

Example: Hudson Auto Repair

[Figure: dot plot "Tune-up Parts Cost". Horizontal axis: Cost ($), 50 to 110.]

2.4 Histogram

• Another common graphical presentation of quantitative data is a histogram.
• The variable of interest is placed on the horizontal axis.
• A rectangle is drawn above each class interval with its height corresponding to the interval’s frequency, relative frequency, or percent frequency.
• Unlike a bar graph, a histogram has no natural separation between rectangles of adjacent classes.

Histogram

Example: Hudson Auto Repair

[Figure: histogram "Tune-up Parts Cost". Horizontal axis: Parts Cost ($), classes 50-59, 60-69, 70-79, 80-89, 90-99, 100-109. Vertical axis: Frequency, 2 to 18.]

Histograms Showing Skewness

Symmetric
  • Left tail is the mirror image of the right tail
  • Examples: heights and weights of people

Moderately Skewed Left
  • A longer tail to the left
  • Example: exam scores

Moderately Skewed Right
  • A longer tail to the right
  • Example: housing values

Highly Skewed Right
  • A very long tail to the right
  • Example: executive salaries

[Figures: one relative frequency histogram for each of the four shapes. Vertical axis: Relative Frequency, 0 to .35.]

2.5 Cumulative Distributions

Cumulative frequency distribution: shows the number of items with values less than or equal to the upper limit of each class.

Cumulative relative frequency distribution: shows the proportion of items with values less than or equal to the upper limit of each class.

Cumulative percent frequency distribution: shows the percentage of items with values less than or equal to the upper limit of each class.

Cumulative Distributions

• The last entry in a cumulative frequency distribution always equals the total number of observations.
• The last entry in a cumulative relative frequency distribution always equals 1.00.
• The last entry in a cumulative percent frequency distribution always equals 100.

Cumulative Distributions

Example: Hudson Auto Repair

Cost ($)    Cumulative Frequency    Cumulative Relative Frequency    Cumulative Percent Frequency
≤ 59        2                       .04                              4
≤ 69        15                      .30                              30
≤ 79        31                      .62                              62
≤ 89        38                      .76                              76
≤ 99        45                      .90                              90
≤ 109       50                      1.00                             100

For the ≤ 69 row: the cumulative frequency is 2 + 13 = 15, the cumulative relative frequency is 15/50 = .30, and the cumulative percent frequency is .30(100) = 30. The last cumulative frequency, 50, is the number of observations.

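The running totals in this table can be reproduced with a short sketch. It is illustrative only and assumes the class frequencies from the six-class Hudson Auto frequency distribution; the variable names are not from the slides.

```python
from itertools import accumulate

# Upper class limits and class frequencies for the Hudson Auto parts costs.
upper_limits = [59, 69, 79, 89, 99, 109]
frequencies = [2, 13, 16, 7, 7, 5]

n = sum(frequencies)                          # total number of observations
cum_freq = list(accumulate(frequencies))      # running total of frequencies
cum_rel = [c / n for c in cum_freq]           # proportion at or below each limit
cum_pct = [100 * c / n for c in cum_freq]     # percentage at or below each limit

for limit, cf, cr, cp in zip(upper_limits, cum_freq, cum_rel, cum_pct):
    print(f"<= {limit:<5}{cf:>4}{cr:>7.2f}{cp:>6.0f}")
```
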
2.6 Ogive

• An ogive is a graph of a cumulative distribution.
• The data values are shown on the horizontal axis.
• Shown on the vertical axis are the:
  • cumulative frequencies, or
  • cumulative relative frequencies, or
  • cumulative percent frequencies
• The frequency (one of the above) of each class is plotted as a point.
• The plotted points are connected by straight lines.

Ogive

Example: Hudson Auto Repair
  • Because the class limits for the parts-cost data are 50-59, 60-69, and so on, there appear to be one-unit gaps from 59 to 60, 69 to 70, and so on.
  • These gaps are eliminated by plotting points halfway between the class limits.
  • Thus, 59.5 is used for the 50-59 class, 69.5 is used for the 60-69 class, and so on.

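A small sketch of how the plotted points might be computed, assuming whole-dollar data so that the halfway point is the upper class limit plus 0.5. The names below are illustrative, and drawing the chart itself is left to whatever plotting tool is at hand.

```python
from itertools import accumulate

# Upper class limits and frequencies for the Hudson Auto parts costs.
upper_limits = [59, 69, 79, 89, 99, 109]
frequencies = [2, 13, 16, 7, 7, 5]

n = sum(frequencies)
cum_pct = [100 * c / n for c in accumulate(frequencies)]

# Plot each cumulative percent frequency halfway between adjacent class
# limits (59.5, 69.5, ...) so the one-unit gaps disappear.
ogive_points = [(upper + 0.5, pct) for upper, pct in zip(upper_limits, cum_pct)]

for x, y in ogive_points:
    print(f"({x}, {y:.0f})")
```
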
Ogive with Cumulative Percent Frequencies

Example: Hudson Auto Repair

[Figure: ogive "Tune-up Parts Cost". Horizontal axis: Parts Cost ($), 50 to 110. Vertical axis: Cumulative Percent Frequency, 20 to 100. Labeled point: (89.5, 76).]

3. Exploratory Data Analysis

• The techniques of exploratory data analysis consist of simple arithmetic and easy-to-draw pictures that can be used to summarize data quickly.
• One such technique is the stem-and-leaf display.

3.1 Stem-and-Leaf Display

• A stem-and-leaf display shows both the rank order and shape of the distribution of the data.
• It is similar to a histogram on its side, but it has the advantage of showing the actual data values.
• The first digits of each data item are arranged to the left of a vertical line.
• To the right of the vertical line we record the last digit for each item in rank order.
• Each line in the display is referred to as a stem.
• Each digit on a stem is a leaf.

Example: Hudson Auto Repair

The manager of Hudson Auto would like to gain a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed on the next slide.

Stem-and-Leaf Display

Example: Hudson Auto Repair

Sample of Parts Cost ($) for 50 Tune-ups

 91   78   93   57   75   52   99   80   97   62
 71   69   72   89   66   75   79   75   72   76
104   74   62   68   97  105   77   65   80  109
 85   97   88   68   83   68   71   69   67   74
 62   82   98  101   79  105   79   69   62   73

Stem-and-Leaf Display

Example: Hudson Auto Repair

 5 | 2 7
 6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3 5 8 9
 9 | 1 3 7 7 7 8 9
10 | 1 4 5 5 9

Each row (for example, the row beginning with 7) is a stem; each digit to the right of the line is a leaf.

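The display can be built mechanically by splitting each value into its leading digits and its last digit. The sketch below assumes whole-number data with a leaf unit of 1; `stem_and_leaf` is a hypothetical helper name.

```python
# Parts cost ($) for the 50 tune-ups in the Hudson Auto sample.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]


def stem_and_leaf(values):
    """Group the last digit (leaf) of each value under its leading digits (stem)."""
    display = {}
    for v in sorted(values):           # sorting puts the leaves in rank order
        stem, leaf = divmod(v, 10)
        display.setdefault(stem, []).append(leaf)
    return display


display = stem_and_leaf(costs)
for stem, leaves in display.items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```
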
Stretched Stem-and-Leaf Display

• If we believe the original stem-and-leaf display has condensed the data too much, we can stretch the display vertically by using two stems for each leading digit(s).
• Whenever a stem value is stated twice, the first value corresponds to leaf values of 0 - 4, and the second value corresponds to leaf values of 5 - 9.

Stretched Stem-and-Leaf Display

Example: Hudson Auto Repair

 5 | 2
 5 | 7
 6 | 2 2 2 2
 6 | 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4
 7 | 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3
 8 | 5 8 9
 9 | 1 3
 9 | 7 7 7 8 9
10 | 1 4
10 | 5 5 9

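Stretching the display only changes the grouping key: each stem is split into a low half (leaves 0 to 4) and a high half (leaves 5 to 9). A minimal sketch, with the hypothetical helper `stretched_stem_and_leaf` and the same whole-dollar data:

```python
# Parts cost ($) for the 50 tune-ups in the Hudson Auto sample.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]


def stretched_stem_and_leaf(values):
    """Use two rows per stem: half 0 holds leaves 0-4, half 1 holds leaves 5-9."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        half = 0 if leaf <= 4 else 1
        rows.setdefault((stem, half), []).append(leaf)
    return rows


rows = stretched_stem_and_leaf(costs)
for (stem, _half), leaves in sorted(rows.items()):
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```
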
Stem-and-Leaf Display

Leaf Units
  • A single digit is used to define each leaf.
  • In the preceding example, the leaf unit was 1.
  • Leaf units may be 100, 10, 1, 0.1, and so on.
  • Where the leaf unit is not shown, it is assumed to equal 1.
  • The leaf unit indicates how to multiply the stem-and-leaf numbers in order to approximate the original data.

Example: Leaf Unit = 0.1

If we have data with values such as

8.6   11.7   9.4   9.1   10.2   11.0   8.8

a stem-and-leaf display of these data will be

Leaf Unit = 0.1

 8 | 6 8
 9 | 1 4
10 | 2
11 | 0 7

Example: Leaf Unit = 10

If we have data with values such as

1806   1717   1974   1791   1682   1910   1838

a stem-and-leaf display of these data will be

Leaf Unit = 10

16 | 8
17 | 1 9
18 | 0 3
19 | 1 7

The 82 in 1682 is rounded down to 80 and is represented as an 8.

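Both leaf-unit examples can be handled by one routine that first expresses each value in leaf units, discards anything smaller (rounding down, as in the 1682 example), and then splits off the last digit. This is a sketch under assumptions: the function name is hypothetical, and `Decimal` is used here only to avoid binary floating-point surprises when dividing by 0.1.

```python
from decimal import Decimal


def stem_and_leaf(values, leaf_unit):
    """Stem-and-leaf display for a given leaf unit (e.g. 10, 1, or 0.1)."""
    unit = Decimal(str(leaf_unit))
    display = {}
    for v in sorted(values):
        # Number of whole leaf units in the value; int() drops any remainder.
        scaled = int(Decimal(str(v)) / unit)
        stem, leaf = divmod(scaled, 10)
        display.setdefault(stem, []).append(leaf)
    return display


tenths = stem_and_leaf([8.6, 11.7, 9.4, 9.1, 10.2, 11.0, 8.8], leaf_unit=0.1)
tens = stem_and_leaf([1806, 1717, 1974, 1791, 1682, 1910, 1838], leaf_unit=10)

for name, display in (("Leaf Unit = 0.1", tenths), ("Leaf Unit = 10", tens)):
    print(name)
    for stem, leaves in display.items():
        print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```

Multiplying a stem-and-leaf pair by the leaf unit recovers an approximation of the original value: stem 16 with leaf 8 at leaf unit 10 gives 168 x 10 = 1680.
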
4. Crosstabulations and Scatter Diagrams

• Thus far we have focused on methods that are used to summarize the data for one variable at a time.
• Often a manager is interested in tabular and graphical methods that will help understand the relationship between two variables.
• Crosstabulation and a scatter diagram are two methods for summarizing the data for two variables simultaneously.

4.1 Crosstabulation

• A crosstabulation is a tabular summary of data for two variables.
• Crosstabulation can be used when:
  • one variable is qualitative and the other is quantitative,
  • both variables are qualitative, or
  • both variables are quantitative.
• The left and top margin labels define the classes for the two variables.

Crosstabulation

Example: Finger Lakes Homes

The number of Finger Lakes homes sold for each style and price for the past two years is shown below. Price range is the quantitative variable; home style is the categorical variable.

                      Home Style
Price Range    Colonial    Log    Split    A-Frame    Total
< $200,000     18          6      19       12         55
≥ $200,000     12          14     16       3          45
Total          30          20     35       15         100

Crosstabulation

Example: Finger Lakes Homes

Insights Gained from Preceding Crosstabulation
  • The greatest number of homes (19) in the sample are a split-level style and priced at less than $200,000.
  • Only three homes in the sample are an A-Frame style and priced at $200,000 or more.

Crosstabulation

Example: Finger Lakes Homes

                      Home Style
Price Range    Colonial    Log    Split    A-Frame    Total
< $200,000     18          6      19       12         55
≥ $200,000     12          14     16       3          45
Total          30          20     35       15         100

The Total column (55, 45) is the frequency distribution for the price range variable. The Total row (30, 20, 35, 15) is the frequency distribution for the home style variable.

Crosstabulation: Row or Column Percentages

• Converting the entries in the table into row percentages or column percentages can provide additional insight about the relationship between the two variables.

Crosstabulation: Row Percentages

Example: Finger Lakes Homes

                      Home Style
Price Range    Colonial    Log      Split    A-Frame    Total
< $200,000     32.73       10.91    34.55    21.82      100
≥ $200,000     26.67       31.11    35.56    6.67       100

Note: row totals are actually 100.01 due to rounding.

(Colonial and ≥ $200K)/(All ≥ $200K) x 100 = (12/45) x 100 = 26.67

Crosstabulation: Column Percentages

Example: Finger Lakes Homes

                      Home Style
Price Range    Colonial    Log      Split    A-Frame
< $200,000     60.00       30.00    54.29    80.00
≥ $200,000     40.00       70.00    45.71    20.00
Total          100         100      100      100

(Colonial and ≥ $200K)/(All Colonial) x 100 = (12/30) x 100 = 40.00

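Row and column percentages differ only in the total used as the denominator. The sketch below starts from the Finger Lakes counts; the dictionary layout and the function names are assumptions made for illustration.

```python
# Counts from the Finger Lakes Homes crosstabulation (rows: price range).
counts = {
    "< $200,000": {"Colonial": 18, "Log": 6, "Split": 19, "A-Frame": 12},
    ">= $200,000": {"Colonial": 12, "Log": 14, "Split": 16, "A-Frame": 3},
}


def row_percentages(table):
    """Each cell as a percentage of its row (price range) total."""
    result = {}
    for row, cells in table.items():
        total = sum(cells.values())
        result[row] = {col: round(100 * n / total, 2) for col, n in cells.items()}
    return result


def column_percentages(table):
    """Each cell as a percentage of its column (home style) total."""
    columns = next(iter(table.values())).keys()
    totals = {col: sum(cells[col] for cells in table.values()) for col in columns}
    return {
        row: {col: round(100 * n / totals[col], 2) for col, n in cells.items()}
        for row, cells in table.items()
    }


row_pct = row_percentages(counts)
col_pct = column_percentages(counts)
print(row_pct[">= $200,000"]["Colonial"])   # share of higher-priced homes that are Colonial
print(col_pct[">= $200,000"]["Colonial"])   # share of Colonial homes that are higher-priced
```
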
4.2 Scatter Diagram and Trendline

• A scatter diagram is a graphical presentation of the relationship between two quantitative variables.
• One variable is shown on the horizontal axis and the other variable is shown on the vertical axis.
• The general pattern of the plotted points suggests the overall relationship between the variables.
• A trendline provides an approximation of the relationship.

Scatter Diagram

[Figure: scatter diagram showing a positive relationship.]

[Figure: scatter diagram showing a negative relationship.]

[Figure: scatter diagram showing no apparent relationship.]

Scatter Diagram

Example: Panthers Football Team

The Panthers football team is interested in investigating the relationship, if any, between interceptions made and points scored.

x = Number of Interceptions    y = Number of Points Scored
1                              14
3                              24
2                              18
1                              17
3                              30

Scatter Diagram

[Figure: scatter diagram for the Panthers. Horizontal axis (x): Number of Interceptions, 0 to 4. Vertical axis (y): Number of Points Scored, 0 to 35.]

Example: Panthers Football Team

Insights Gained from the Preceding Scatter Diagram
  • The scatter diagram indicates a positive relationship between the number of interceptions and the number of points scored.
  • Higher points scored are associated with a higher number of interceptions.
  • The relationship is not perfect; not all plotted points in the scatter diagram lie on a straight line.

Scatter Diagram and Trendline

[Figure: scatter diagram for the Panthers with trendline. Horizontal axis: Number of Interceptions, 0 to 4. Vertical axis: Number of Points Scored, 0 to 35.]

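The slides do not state how the trendline was fitted. One common choice is an ordinary least-squares line, sketched below for the five Panthers observations; treating the trendline as a least-squares fit is an assumption of this example, as is the helper name `least_squares_line`.

```python
# Panthers data: interceptions (x) and points scored (y).
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 30]


def least_squares_line(xs, ys):
    """Slope and intercept of the ordinary least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_line(x, y)
print(f"points = {intercept:.2f} + {slope:.2f} * interceptions")
```

A positive slope agrees with the positive relationship read off the scatter diagram; with only five observations the fitted line is a rough approximation at best.
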
Tabular and Graphical Methods

Categorical Data
  Tabular Methods
    • Frequency Distribution
    • Rel. Freq. Dist.
    • Percent Freq. Distribution
    • Crosstabulation
  Graphical Methods
    • Bar Chart
    • Pie Chart

Quantitative Data
  Tabular Methods
    • Frequency Distribution
    • Rel. Freq. Dist.
    • % Freq. Dist.
    • Cum. Freq. Dist.
    • Cum. Rel. Freq. Distribution
    • Cum. % Freq. Distribution
    • Crosstabulation
  Graphical Methods
    • Dot Plot
    • Histogram
    • Ogive
    • Stem-and-Leaf Display
    • Scatter Diagram
