Chapter 2 - Describing The Data

This chapter aims to introduce students to constructing and interpreting frequency distributions, histograms, bar charts, pie charts, stem-and-leaf diagrams, and line and scatter plots as ways to describe data. A frequency distribution lists the values of a variable and the corresponding frequencies with which each value occurs. It condenses raw data into a more useful form for quick visual interpretation. Frequency distributions can describe discrete or continuous data by grouping the latter into classes.

Uploaded by

Azam Maulana
Copyright
© All Rights Reserved

Engineering Statistics

Chapter 2
Graphs, Charts, and Tables – Describing Your Data

Chapter Goals

After completing this chapter, you should be able to:
• Construct a frequency distribution both manually and with a computer.
• Construct and interpret a histogram.
• Create and interpret bar charts, pie charts, and stem-and-leaf diagrams.
• Present and interpret data in line charts and scatter diagrams.

Frequency Distributions

What is a Frequency Distribution?
• A frequency distribution is a list or a table containing the values of a variable (or a set of ranges within which the data falls) and the corresponding frequencies with which each value occurs (or frequencies with which data falls within each range).

Why Use Frequency Distributions?

• A frequency distribution is a way to summarize data.
• The distribution condenses the raw data into a more useful form and allows for a quick visual interpretation of the data.
Frequency Distribution: Discrete Data

• Discrete data: possible values are countable

Example: the number of cars waiting to turn right at a certain intersection has been observed.

Number of cars    Frequency
0                 44
1                 24
2                 18
3                 16
4                 20
5                 22
6                 26
7                 30
Total             200

Relative Frequency

Relative Frequency: What proportion is in each category?

Number of cars    Frequency    Relative Frequency
0                 44           .22
1                 24           .12
2                 18           .09
3                 16           .08
4                 20           .10
5                 22           .11
6                 26           .13
7                 30           .15
Total             200          1.00

44 / 200 = .22, so 22% of the observations found no car waiting to turn right.
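
The relative frequency column above can be reproduced with a few lines of Python. This is only a minimal sketch: the variable names (`frequency`, `relative_frequency`) are illustrative choices, not part of the original slides.

```python
# Observed counts from the example: number of cars waiting -> frequency
frequency = {0: 44, 1: 24, 2: 18, 3: 16, 4: 20, 5: 22, 6: 26, 7: 30}

total = sum(frequency.values())  # total number of observations

# Relative frequency = class frequency / total number of observations
relative_frequency = {cars: count / total for cars, count in frequency.items()}

# Print the table with relative frequencies rounded to two decimals
for cars, count in frequency.items():
    print(f"{cars:>5} {count:>9} {relative_frequency[cars]:>9.2f}")
print(f"{'Total':>5} {total:>9} {sum(relative_frequency.values()):>9.2f}")
```

Dividing each count by the same total guarantees that the relative frequencies sum to 1, which is a quick sanity check on any frequency table.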

Frequency Distribution: Continuous Data

• Continuous Data: may take on any value in some interval.

Example: A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature

24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
32, 13, 12, 38, 41, 43, 44, 27, 53, 27

(Temperature is a continuous variable because it could be measured to any degree of precision desired).

Grouping Data by Classes

Sort raw data in ascending order:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

• Find range: 58 - 12 = 46
• Select number of classes: 5 (usually between 5 and 20)
• Compute class width: 10 (46/5 = 9.2, then round up)
• Determine class boundaries: 10, 20, 30, 40, 50 (lower boundaries; the last class ends at 60)
• Compute class midpoints: 15, 25, 35, 45, 55
• Count observations & assign to classes.
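
The grouping steps above can be sketched in Python. The slides only say to round the width off, so two choices here are assumptions made for illustration: the width is rounded up to the next multiple of 10, and the first class starts at the largest multiple of the width that does not exceed the smallest value.

```python
import math

temperatures = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
                32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

num_classes = 5
data_range = max(temperatures) - min(temperatures)   # 58 - 12 = 46
min_width = data_range / num_classes                 # 46 / 5 = 9.2

# Assumption: round the width up to a convenient multiple of 10
class_width = math.ceil(min_width / 10) * 10

# Assumption: start the first class at a multiple of the class width
start = (min(temperatures) // class_width) * class_width

lower_bounds = [start + i * class_width for i in range(num_classes)]
midpoints = [lower + class_width / 2 for lower in lower_bounds]

# Assign each observation to the class "lower but under lower + width"
counts = [0] * num_classes
for t in temperatures:
    counts[(t - start) // class_width] += 1

for lower, mid, count in zip(lower_bounds, midpoints, counts):
    print(f"{lower} but under {lower + class_width}: midpoint {mid}, frequency {count}")
```

With other data the rounding rule would need to be chosen to suit the scale of the values; rounding up to a multiple of 10 is convenient here only because the temperatures are two-digit numbers.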
Frequency Distribution Example

Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Frequency Distribution

Class              Frequency    Relative Frequency
10 but under 20    3            .15
20 but under 30    6            .30
30 but under 40    5            .25
40 but under 50    4            .20
50 but under 60    2            .10
Total              20           1.00

Histograms

• The classes or intervals are shown on the horizontal axis.
• Frequency is measured on the vertical axis.
• Bars of the appropriate heights can be used to represent the number of observations within each class.
• Such a graph is called a histogram.
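
As a rough illustration of how bar length encodes frequency, the class frequencies from the table above can be drawn as a text histogram. This sketch is rotated relative to the description in the slides (classes run down the page and bars extend to the right), and the helper name `text_histogram` is hypothetical.

```python
classes = ["10 but under 20", "20 but under 30", "30 but under 40",
           "40 but under 50", "50 but under 60"]
frequencies = [3, 6, 5, 4, 2]

def text_histogram(labels, counts, symbol="#"):
    """Return one line per class: the label followed by a bar of `count` symbols."""
    width = max(len(label) for label in labels)
    return [f"{label:<{width}} | {symbol * count} ({count})"
            for label, count in zip(labels, counts)]

lines = text_histogram(classes, frequencies)
print("\n".join(lines))
```

A plotting library would normally be used for a real histogram; the text version is only meant to show that each bar is drawn directly from the class frequency.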

Histogram Example

Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

[Figure: Histogram. Vertical axis: Frequency (0 to 7). Horizontal axis: Class Midpoints (5, 15, 25, 35, 45, 55, More). Bar heights: 0, 3, 6, 5, 4, 2, 0. No gaps between bars, since the data are continuous.]

Questions for Grouping Data into Classes

• 1. How wide should each interval be? (How many classes should be used?)
• 2. How should the endpoints of the intervals be determined?
• Often answered by trial and error, subject to user judgment
• The goal is to create a distribution that is neither too "jagged" nor too "blocky"
• Goal is to appropriately show the pattern of variation in the data.
How Many Class Intervals?

• Many (Narrow class intervals)
  • may yield a very jagged distribution with gaps from empty classes
  • can give a poor indication of how frequency varies across classes.

[Figure: Histogram of Temperature with narrow classes (upper endpoints 4, 8, 12, ..., 60, More). Vertical axis: Frequency (0 to 3.5).]

• Few (Wide class intervals)
  • may compress variation too much and yield a blocky distribution
  • can obscure important patterns of variation.

[Figure: Histogram of Temperature with wide classes (upper endpoints 0, 30, 60, More). Vertical axis: Frequency (0 to 12).]

(X axis labels are upper class endpoints)

General Guidelines

• Number of Data Points    Number of Classes
  under 50                 5 - 7
  50 – 100                 6 - 10
  100 – 250                7 - 12
  over 250                 10 - 20

– Class widths can typically be reduced as the number of observations increases.
– Distributions with numerous observations are more likely to be smooth and have gaps filled since data are plentiful.
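
The guideline table above can be written as a small lookup. This is a sketch only: the function name `suggested_class_range` is hypothetical, and because the table's ranges share their endpoints (100 appears in two rows), assigning exactly 100 points to the 50 – 100 row and exactly 250 to the 100 – 250 row is an assumption.

```python
def suggested_class_range(n):
    """Return (fewest, most) classes suggested by the guideline table for n data points."""
    if n < 50:
        return (5, 7)
    if n <= 100:      # assumption: exactly 100 falls in the 50 - 100 row
        return (6, 10)
    if n <= 250:      # assumption: exactly 250 falls in the 100 - 250 row
        return (7, 12)
    return (10, 20)

# The temperature example has 20 observations, and the slides use 5 classes
low, high = suggested_class_range(20)
print(f"20 data points: between {low} and {high} classes")
```

These ranges are starting points; as the slides note, the final choice is usually settled by trial and error.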

Class Width

• The class width is the distance between the lowest possible value and the highest possible value for a frequency class.

• The minimum class width is

  W = (Largest Value - Smallest Value) / Number of Classes

Stem and Leaf Diagram

• A simple way to see distribution details in a data set.

METHOD: Separate the sorted data series into leading digits (the stem) and the trailing digits (the leaves).
Example:

Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

• Here, use the 10's digit for the stem unit:

                    Stem    Leaf
• 12 is shown as    1       2
• 35 is shown as    3       5

Example:

Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

• Completed Stem-and-leaf diagram:

Stem    Leaves
1       2 3 7
2       1 4 4 6 7 7
3       0 2 5 7 8
4       1 3 4 6
5       3 8
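
The method above (tens digit as the stem, units digit as the leaf) can be sketched in Python for two-digit data. The function name `stem_and_leaf` is an illustrative choice, not something from the slides.

```python
from collections import defaultdict

temperatures = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
                32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

def stem_and_leaf(values):
    """Group values by tens digit (stem) and collect the units digits (leaves) in order."""
    leaves = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)   # e.g. 35 -> stem 3, leaf 5
        leaves[stem].append(leaf)
    return dict(leaves)

diagram = stem_and_leaf(temperatures)
for stem in sorted(diagram):
    print(f"{stem} | {' '.join(str(leaf) for leaf in diagram[stem])}")
```

Each row of the output has one leaf per observation, so the row lengths match the class frequencies of the corresponding frequency distribution.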

Graphing Categorical Data

Categorical Data:
• Pie Charts
• Bar Charts
• Pareto Diagram

Bar and Pie Charts

• Bar charts and Pie charts are often used for qualitative (category) data.

• Height of bar or size of pie slice shows the frequency or percentage for each category.
Pie Chart Example

Current Investment Portfolio

Investment Type    Amount (in thousands $)    Percentage
Stocks             46.5                       42.27
Bonds              32.0                       29.09
CD                 15.5                       14.09
Savings            16.0                       14.55
Total              110                        100

(Variables are Qualitative)

[Figure: Pie chart with slices Stocks 42%, Bonds 29%, Savings 15%, CD 14%. Percentages are rounded to the nearest percent.]

Bar Chart Example

[Figure: Horizontal bar chart "Investor's Portfolio" with bars for Savings, CD, Bonds, and Stocks. Horizontal axis: Amount in $1000's (0 to 50).]
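
The percentage column of the investment portfolio table, and the whole-percent labels on the pie chart, follow directly from the amounts. A minimal sketch, with variable names chosen only for illustration:

```python
# Amounts in thousands of dollars, from the portfolio table
portfolio = {"Stocks": 46.5, "Bonds": 32.0, "CD": 15.5, "Savings": 16.0}

total = sum(portfolio.values())

# Percentage of the total for each investment type, to two decimals (table column)
percentages = {name: round(100 * amount / total, 2)
               for name, amount in portfolio.items()}

# Rounded to the nearest percent (pie slice labels)
slice_labels = {name: round(pct) for name, pct in percentages.items()}

for name in portfolio:
    print(f"{name:<8} {portfolio[name]:>6.1f} {percentages[name]:>6.2f} {slice_labels[name]:>3}%")
```

Rounded slice labels do not always add up to exactly 100; here they happen to (42 + 29 + 14 + 15).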

Pareto Diagram Example

[Figure: Pareto diagram for the investment portfolio. Horizontal axis categories: Stocks, Bonds, Savings, CD. Left vertical axis: % invested in each category (bar graph), 0% to 45%. Right vertical axis: cumulative % invested (line graph), 0% to 100%.]

Bar Chart Example

Number of vehicles    Frequency
0                     44
1                     24
2                     18
3                     16
4                     20
5                     22
6                     26
7                     30
Total                 200

[Figure: Bar chart. Horizontal axis: Number of vehicles turning right (0 to 7). Vertical axis: Frequency (0 to 50).]
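
The two series in the Pareto diagram above (bars sorted from largest to smallest, plus a cumulative line) can be computed from the portfolio amounts. This is a sketch under the assumption that the diagram uses the same four amounts as the pie chart example; the variable names are illustrative.

```python
# Amounts in thousands of dollars, from the portfolio table
portfolio = {"Stocks": 46.5, "Bonds": 32.0, "CD": 15.5, "Savings": 16.0}
total = sum(portfolio.values())

# A Pareto diagram orders categories from largest to smallest
ordered = sorted(portfolio.items(), key=lambda item: item[1], reverse=True)

pareto = []      # rows of (category, % in category, cumulative %)
running = 0.0
for name, amount in ordered:
    running += amount
    pareto.append((name,
                   round(100 * amount / total, 2),
                   round(100 * running / total, 2)))

for name, pct, cumulative in pareto:
    print(f"{name:<8} {pct:>6.2f}% {cumulative:>7.2f}%")
```

Accumulating the amounts before converting to percent, rather than summing rounded percentages, ensures the last cumulative value is exactly 100.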
Tabulating and Graphing Multivariate Categorical Data

• Investment in thousands of dollars:

Investment Category    Investor A    Investor B    Investor C    Total
Stocks                 46.5          55            27.5          129
Bonds                  32.0          44            19.0          95
CD                     15.5          20            13.5          49
Savings                16.0          28            7.0           51
Total                  110.0         147           67.0          324

Tabulating and Graphing Multivariate Categorical Data (continued)

• Side by side charts

[Figure: Horizontal side-by-side bar chart "Comparing Investors". Categories: Savings, CD, Bonds, Stocks. Series: Investor A, Investor B, Investor C. Horizontal axis: 0 to 60.]
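
The row and column totals of the multivariate table above can be checked with a nested dictionary. A minimal sketch; the data layout and names are assumptions made for illustration.

```python
# Investment in thousands of dollars, by category and investor
investments = {
    "Stocks":  {"A": 46.5, "B": 55.0, "C": 27.5},
    "Bonds":   {"A": 32.0, "B": 44.0, "C": 19.0},
    "CD":      {"A": 15.5, "B": 20.0, "C": 13.5},
    "Savings": {"A": 16.0, "B": 28.0, "C": 7.0},
}
investors = ("A", "B", "C")

# Row totals: sum across investors for each category
row_totals = {category: sum(by_investor.values())
              for category, by_investor in investments.items()}

# Column totals: sum across categories for each investor
column_totals = {investor: sum(investments[category][investor] for category in investments)
                 for investor in investors}

grand_total = sum(row_totals.values())
print(row_totals, column_totals, grand_total)
```

The grand total computed from the rows should equal the one computed from the columns, which is a simple consistency check on a cross-tabulation.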

Side-by-Side Chart Example

• Sales by quarter for three sales territories:

         1st Qtr    2nd Qtr    3rd Qtr    4th Qtr
East     20.4       27.4       59         20.4
West     30.6       38.6       34.6       31.6
North    45.9       46.9       45         43.9

[Figure: Side-by-side bar chart of sales by quarter (1st Qtr to 4th Qtr) for East, West, and North. Vertical axis: 0 to 60.]

Line Charts and Scatter Diagrams

• Line charts show values of one variable vs. time
  – Time is traditionally shown on the horizontal axis.

• Scatter Diagrams show points for bivariate data
  – one variable is measured on the vertical axis and the other variable is measured on the horizontal axis.
Line Chart Example

Year    Inflation Rate
1985    3.56
1986    1.86
1987    3.65
1988    4.14
1989    4.82
1990    5.40
1991    4.21
1992    3.01
1993    2.99
1994    2.56
1995    2.83
1996    2.95
1997    2.29
1998    1.56
1999    2.21
2000    3.36
2001    2.85
2002    1.58

[Figure: Line chart "U.S. Inflation Rate". Horizontal axis: Year (1984 to 2002). Vertical axis: Inflation Rate (%) (0 to 6).]

Scatter Diagram Example

Volume per day    Cost per day
23                125
26                140
29                146
33                160
38                167
42                170
50                188
55                195
60                200

[Figure: Scatter diagram "Production Volume vs. Cost per Day". Horizontal axis: Volume per Day (0 to 70). Vertical axis: Cost per Day (0 to 250).]

Types of Relationships

• Linear Relationships

[Figure: Two scatter diagrams of Y against X illustrating linear relationships.]

Types of Relationships (continued)

• Curvilinear Relationships

[Figure: Two scatter diagrams of Y against X illustrating curvilinear relationships.]

Types of Relationships (continued)

• No Relationship

[Figure: Two scatter diagrams of Y against X illustrating no relationship.]

Chapter Summary

• Data in raw form are usually not easy to use for decision making -- Some type of organization is needed:
  ♦ Table
  ♦ Graph
• Techniques reviewed in this chapter:
  – Frequency Distributions and Histograms
  – Bar Charts and Pie Charts
  – Stem and Leaf Diagrams
  – Line Charts and Scatter Diagrams.

Thank You
