DATAENG Lesson 6a Descriptive Statistics (Self Study) Handout
DATAENG
(Engineering Data Analysis)
Copyright © 2014 John Wiley & Sons, Inc. All rights reserved.
Sixth Edition
Douglas C. Montgomery George C. Runger
Chapter 6
Descriptive Statistics
6/26/2020
Sample Mean
\[
\bar{x} = \text{average} = \frac{\sum_{i=1}^{8} x_i}{8}
        = \frac{12.6 + 12.9 + \cdots + 13.1}{8}
        = \frac{104}{8} = 13.0 \text{ pounds}
\]

i      xi
1      12.6
2      12.9
3      13.4
4      12.3
5      13.6
6      13.5
7      12.6
8      13.1
Mean   13.00   = AVERAGE($B2:$B9)
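The same average can be checked outside a spreadsheet. A minimal sketch in Python, using the eight observations listed above; the variable names are illustrative and not part of the handout.

```python
from statistics import mean

# The eight observations from the Sample Mean example, in pounds.
observations = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]

# Sample mean: the sum of the observations divided by the sample size n.
n = len(observations)
x_bar = sum(observations) / n

print(f"n = {n}, sum = {sum(observations):.1f}, x-bar = {x_bar:.2f} pounds")

# The standard library gives the same value (up to floating-point rounding).
print(f"statistics.mean -> {mean(observations):.2f} pounds")
```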
Variance Defined
Table 6-1
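As a sketch only: assuming the usual definition of the sample variance, the sum of squared deviations from x-bar divided by n-1, the terms can be tabulated for the eight observations from the Sample Mean example. The column layout below is illustrative and is not a reproduction of Table 6-1.

```python
from statistics import variance

# The eight observations from the Sample Mean example, in pounds.
observations = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]
n = len(observations)
x_bar = sum(observations) / n

# Deviations from the mean and their squares, one row per observation.
deviations = [x - x_bar for x in observations]
squared = [d ** 2 for d in deviations]

print(f"{'i':>2} {'x_i':>6} {'x_i - x_bar':>12} {'(x_i - x_bar)^2':>16}")
for i, (x, d, sq) in enumerate(zip(observations, deviations, squared), start=1):
    print(f"{i:>2} {x:>6.1f} {d:>12.2f} {sq:>16.4f}")

# Assumed standard definition: divide the sum of squared deviations by n - 1.
sum_sq = sum(squared)
s2 = sum_sq / (n - 1)
s = s2 ** 0.5
print(f"sum of squares = {sum_sq:.2f}, s^2 = {s2:.4f}, s = {s:.2f}")
```

The result can be cross-checked against `statistics.variance`, which also divides by n-1.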
Computation of s²
The prior calculation is definitional and tedious. A
shortcut is derived here and involves just 2 sums.
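A sketch of the usual algebra behind the shortcut, assuming the definitional form of s² (sum of squared deviations from x-bar, divided by n-1). The two sums it needs are the sum of the observations and the sum of their squares.

```latex
\begin{align*}
s^2 &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} \\[4pt]
\sum_{i=1}^{n} (x_i - \bar{x})^2
    &= \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + n\bar{x}^2 \\
    &= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2
       \qquad \text{since } \sum_{i=1}^{n} x_i = n\bar{x} \\
    &= \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} \\[4pt]
s^2 &= \frac{\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2 / n}{n-1}
\end{align*}
```

For the eight observations in the Sample Mean example, the two sums are 104 and 1353.60, so s² = (1353.60 - 104²/8)/7 = 1.60/7 ≈ 0.2286.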
Degrees of Freedom
• The sample variance is calculated with the
quantity n-1.
• This quantity is called the “degrees of
freedom”.
• Origin of the term:
– There are n deviations from x-bar in the sample.
– The sum of the deviations is zero.
– n-1 of the deviations can be freely determined,
but the nth deviation is fixed to maintain the
zero sum.
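The zero-sum constraint can be seen numerically. A minimal sketch, reusing the eight observations from the Sample Mean example; the variable names are illustrative.

```python
# The eight observations from the Sample Mean example, in pounds.
observations = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]
n = len(observations)
x_bar = sum(observations) / n
deviations = [x - x_bar for x in observations]

# The n deviations from x-bar sum to zero (up to floating-point rounding).
total = sum(deviations)

# Knowing any n - 1 deviations fixes the remaining one:
# it must be the negative of the sum of the others.
implied_last = -sum(deviations[:-1])

print(f"sum of deviations = {total:.10f}")
print(f"last deviation = {deviations[-1]:.4f}, "
      f"implied by the other {n - 1} = {implied_last:.4f}")
```

Only n-1 of the deviations carry independent information, which is why the sum of squares is divided by n-1 rather than n.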
Sample Range
If the n observations in a sample are denoted
by x1, x2, …, xn, the sample range is:
r = max(xi) − min(xi)
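A short sketch, taking the range as the largest observation minus the smallest (the usual definition, assumed here) and applying it to the eight observations from the Sample Mean example.

```python
# The eight observations from the Sample Mean example, in pounds.
observations = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]

# Sample range: largest observation minus smallest observation.
largest = max(observations)
smallest = min(observations)
sample_range = largest - smallest

print(f"r = {largest} - {smallest} = {sample_range:.1f} pounds")
```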
Stem-and-Leaf Diagrams
• Dot diagrams (dotplots) are useful for small
data sets. Stem & leaf diagrams are better
for large sets.
• Steps to construct a stem-and-leaf diagram:
1) Divide each number (xi) into two parts: a stem,
consisting of the leading digits, and a leaf,
consisting of the remaining digit.
2) List the stem values in a vertical column.
3) Record the leaf for each observation beside its
stem.
4) Write the units for the stems and leaves on the
display.
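The four steps above can be sketched in Python. The function names, the `leaf_unit` parameter, and the sample values are hypothetical, chosen only for illustration; the sketch assumes non-negative data and a one-digit leaf.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=1):
    """Split each value into a stem (leading digits) and a leaf (last digit)."""
    stems = defaultdict(list)
    for v in sorted(values):
        scaled = int(round(v / leaf_unit))   # express the value in leaf units
        stem, leaf = divmod(scaled, 10)      # step 1: stem and one-digit leaf
        stems[stem].append(leaf)             # step 3: record leaf beside its stem
    return dict(stems)


def render(stems, leaf_unit=1):
    """Return the diagram as text, one line per stem, including empty stems."""
    lines = []
    for stem in range(min(stems), max(stems) + 1):   # step 2: vertical column of stems
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>3} | {leaves}")
    lines.append(f"Leaf unit = {leaf_unit}")          # step 4: state the units
    return "\n".join(lines)


# Hypothetical values, not taken from the handout.
sample = [105, 97, 76, 110, 87, 115, 101, 121, 118, 83]
print(render(stem_and_leaf(sample)))
```

Sorting the values first means the leaves on each stem come out ordered, which makes the median and quartiles easier to read off the display.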
Quartiles
• The three quartiles partition the data into four equally sized counts
or segments.
– First or lower quartile : 25% of the data is less than q1.
– Second quartile : 50% of the data is less than q2, the median.
– Third or upper quartile : 75% of the data is less than q3.
f      Index    i-th item   (i+1)-th item   Quartile value
0.25   20.25    143         144             143.25
0.50   40.50    160         163             161.50
0.75   60.75    181         181             181.00
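The index values in the table are consistent with f × (n + 1) for n = 80 (for example 0.25 × 81 = 20.25), with the quartile found by interpolating between the i-th and (i+1)-th ordered items. That reading is an inference from the numbers, and the sketch below follows it; the function names are illustrative.

```python
import math


def interpolate(lower, upper, fraction):
    """Linear interpolation between two neighbouring ordered items."""
    return lower + fraction * (upper - lower)


def quartile(data, f):
    """Value at the (possibly fractional) 1-based position f * (n + 1)."""
    x = sorted(data)
    n = len(x)
    index = f * (n + 1)
    i = math.floor(index)
    if i < 1:           # position falls before the first item
        return x[0]
    if i >= n:          # position falls at or beyond the last item
        return x[-1]
    return interpolate(x[i - 1], x[i], index - i)


# Rows of the table: i-th item, (i+1)-th item, fractional part of the index.
for lower, upper, fraction in [(143, 144, 0.25), (160, 163, 0.50), (181, 181, 0.75)]:
    print(f"{interpolate(lower, upper, fraction):.2f}")

# The same rule applied to the eight observations from the Sample Mean example.
observations = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]
q1, q2, q3 = (quartile(observations, f) for f in (0.25, 0.50, 0.75))
print(f"q1 = {q1:.3f}, q2 = {q2:.3f}, q3 = {q3:.3f}")
```

Other software may use a different interpolation rule, so quartiles from another package can differ slightly from these.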
Minitab Descriptives
• The Minitab selection menu:
Stat > Basic Statistics > Display Descriptive Statistics
calculates the descriptive statistics for a data
set.
• For the Table 6-2 data, Minitab produces:
Variable    N    Mean     StDev
Strength    80   162.66   33.77
Frequency Distributions
• A frequency distribution is a compact
summary of data, expressed as a table,
graph, or function.
• The data is gathered into bins or cells,
defined by class intervals.
• The number of classes, multiplied by the
class interval, should exceed the range of the
data. The square root of the sample size is a
guide to the number of classes.
• The boundaries of the class intervals should
be convenient values, as should the class
width.
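The guidelines above can be sketched as a small binning routine. The function names, the parameters `start`, `width`, and `n_classes`, and the sample values are hypothetical; classes are taken as closed on the left and open on the right, which is one common convention and an assumption here.

```python
import math


def suggested_classes(n):
    """Rule-of-thumb number of classes: about the square root of the sample size."""
    return round(math.sqrt(n))


def frequency_distribution(data, start, width, n_classes):
    """Count observations in the classes [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_classes
    for x in data:
        k = math.floor((x - start) / width)
        if not 0 <= k < n_classes:
            raise ValueError(f"{x} falls outside the class intervals")
        counts[k] += 1
    return [(start + k * width, start + (k + 1) * width, c)
            for k, c in enumerate(counts)]


# Hypothetical values, not taken from the handout.
sample = [105, 97, 76, 110, 87, 115, 101, 121, 118, 83]

# 3 classes of width 20 cover 60 units, which exceeds the data range of 45.
for low, high, count in frequency_distribution(sample, start=70, width=20, n_classes=3):
    print(f"{low:>4} <= x < {high:<4} {count}")

print(f"suggested classes for n = 80: {suggested_classes(80)}")
```

Raising an error for values outside the classes is a design choice: it flags a class layout that fails to cover the range instead of silently dropping observations.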
Starting point = 70
Histograms
• A histogram is a visual display of a frequency
distribution, similar to a bar chart or a stem-and-leaf
diagram.
(b) A symmetric distribution has identical mean, median, and mode.
Digidot Plot
Combining a time series plot with some of the other graphical displays
considered previously can be very helpful. A stem-and-leaf plot combined
with a time series plot forms a digidot plot.
Figure 6-17 A digidot plot of the compressive strength data in Table 6-2.
[Figure: vertical axis from 10 to 60; horizontal axis Battery Life (x) in Hours, from 150 to 250]