
3 Analyze Hypothesis Testing Normal Data

The document discusses the evaluation of three suppliers based on their quality and pricing, with a focus on using ANOVA to determine if there are significant differences in the means of their products. It outlines the steps for testing normality and equal variances, as well as the results of the ANOVA analysis, which indicate no significant difference between the suppliers. The document emphasizes the importance of ensuring quality while considering cost in supplier selection.

Uploaded by

yarno.prc

Three Samples

We have three potential suppliers that claim to have equal levels of quality. Supplier B provides a considerably lower purchase price than either of the other two vendors. We would like to choose the lowest cost supplier but we must ensure that we do not affect the quality of our raw material.

Supplier A   Supplier B   Supplier C
3.16         4.24         4.58
4.35         3.87         4.00
3.46         3.87         4.24
3.74         4.12         3.87
3.61         3.74         3.46

We would like to test the data to determine whether there is a difference between the three suppliers.

1
Purpose of ANOVA

Analysis of Variance (ANOVA) is used to investigate and model the relationship between a response variable and one or more independent variables.

Analysis of variance extends the two sample t-test for testing the equality of two population Means to a more general null hypothesis of comparing the equality of more than two Means, versus them not all being equal.
– The classification variable, or factor, usually has
three or more levels (If there are only two levels,
a t-test can be used).
– Allows you to examine differences among means
using multiple comparisons.
– The ANOVA test statistic is:
$$F = \frac{\text{Avg SS}_{\text{between}}}{\text{Avg SS}_{\text{within}}} = \frac{S^2_{\text{between}}}{S^2_{\text{within}}}$$
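Written out for k groups, with n_j observations in group j and N observations in total, the two averages in the ratio are usually defined as below. The symbols k, n_j, N, and the bar notation for means are notation assumed here for illustration; the slide itself only gives the ratio.

```latex
\begin{aligned}
SS_{\text{between}} &= \sum_{j=1}^{k} n_j\,(\bar{y}_j - \bar{y})^2, &
S^2_{\text{between}} &= \frac{SS_{\text{between}}}{k-1},\\
SS_{\text{within}} &= \sum_{j=1}^{k}\sum_{i=1}^{n_j} (y_{ij} - \bar{y}_j)^2, &
S^2_{\text{within}} &= \frac{SS_{\text{within}}}{N-k},\\
F &= \frac{S^2_{\text{between}}}{S^2_{\text{within}}}.
\end{aligned}
```

Under the null hypothesis of equal Means, F follows an F distribution with k - 1 and N - k degrees of freedom.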
2
Follow the Roadmap…Test for Normality

First you have to verify normality.

[Figures: Probability Plot of Supplier A, Supplier B, and Supplier C (Normal); x-axis: data value, y-axis: Percent]

Supplier     Mean    StDev    N   AD      P-Value
Supplier A   3.664   0.4401   5   0.246   0.568
Supplier B   3.968   0.2051   5   0.314   0.385
Supplier C   4.03    0.4177   5   0.148   0.910

All three suppliers' samples are Normally Distributed.
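The AD and P-Value figures on the probability plots can be approximated outside Minitab. The following is a minimal sketch, assuming the Anderson-Darling statistic is computed against a Normal distribution fitted with the sample mean and sample StDev, and assuming the P-value comes from the small-sample adjustment and piecewise approximation published by D'Agostino and Stephens. The function name `anderson_darling` is made up for this example, and the slides do not state how the software derives its P-value.

```python
import math
from statistics import NormalDist, mean, stdev

def anderson_darling(sample):
    """Return (A2, p) for a normality test with estimated mean and StDev."""
    x = sorted(sample)
    n = len(x)
    dist = NormalDist(mean(x), stdev(x))  # sample mean and (n - 1) StDev
    cdf = [dist.cdf(v) for v in x]
    # Pair the i-th smallest point with the i-th largest point.
    s = sum((2 * i + 1) * (math.log(cdf[i]) + math.log(1 - cdf[n - 1 - i]))
            for i in range(n))
    a2 = -n - s / n
    # Assumed small-sample adjustment and piecewise P-value approximation.
    a = a2 * (1 + 0.75 / n + 2.25 / n ** 2)
    if a >= 0.6:
        p = math.exp(1.2937 - 5.709 * a + 0.0186 * a * a)
    elif a >= 0.34:
        p = math.exp(0.9177 - 4.279 * a - 1.38 * a * a)
    elif a >= 0.2:
        p = 1 - math.exp(-8.318 + 42.796 * a - 59.938 * a * a)
    else:
        p = 1 - math.exp(-13.436 + 101.14 * a - 223.73 * a * a)
    return a2, p

supplier_a = [3.16, 4.35, 3.46, 3.74, 3.61]
a2, p = anderson_darling(supplier_a)
print(f"AD = {a2:.3f}, P-value = {p:.3f}")
```

For Supplier A the result should land close to the AD of 0.246 and P-value of 0.568 shown on the plot. A P-value above 0.05 means we fail to reject the hypothesis that the sample comes from a Normal population.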

3
Tests of Variance

Verify ANOVA basic assumption: Equal Variances

Tests of Variance are used for both Normal and Non-normal Data.

Normal Data
– 2 Samples – F-Test
– 3 or More Samples – Bartlett’s Test

Non-Normal Data
– 2 or more samples – Levene’s Test

The null hypothesis states there is no difference between the Standard Deviations or variances.
– Ho: σ1 = σ2 = σ3 …
– Ha: at least one is different
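As an illustration of Levene's Test, here is a minimal sketch applied to the three supplier samples from the Three Samples slide. It assumes the median-centered variant of the test (often called Brown-Forsythe), which is an assumption about the software rather than something the slides state. The P-value line uses a closed form for the upper tail of the F distribution that holds only when the numerator has 2 degrees of freedom, which is the case with three groups.

```python
from statistics import mean, median

def levene_median(groups):
    """Levene statistic on absolute deviations from each group median."""
    z = []
    for g in groups:
        center = median(g)
        z.append([abs(v - center) for v in g])
    k = len(z)
    n_total = sum(len(g) for g in z)
    grand = mean([v for g in z for v in g])
    # One-way ANOVA on the absolute deviations.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in z)
    ss_within = sum((v - mean(g)) ** 2 for g in z for v in g)
    df_within = n_total - k
    w = (ss_between / (k - 1)) / (ss_within / df_within)
    return w, k - 1, df_within

suppliers = [
    [3.16, 4.35, 3.46, 3.74, 3.61],  # Supplier A
    [4.24, 3.87, 3.87, 4.12, 3.74],  # Supplier B
    [4.58, 4.00, 4.24, 3.87, 3.46],  # Supplier C
]
w, df1, df2 = levene_median(suppliers)
# Closed-form upper tail of the F distribution, valid only when df1 == 2.
p_value = (1 + 2 * w / df2) ** (-df2 / 2)
print(f"Levene W = {w:.2f}, P-value = {p_value:.3f}")
```

A P-value above 0.05 means we fail to reject Ho, so there is no evidence that the Standard Deviations differ.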

4
Test for Equal Variance…

Test for equal variance (must stack data first):

[Figure: Test for Equal Variances for Data; 95% Bonferroni Confidence Intervals for StDevs of Supplier A, Supplier B, and Supplier C]

Bartlett's Test: Test Statistic 2.11, P-Value 0.348
Levene's Test: Test Statistic 0.59, P-Value 0.568
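Bartlett's Test Statistic for the supplier data can be recomputed with a few lines of code. This is a minimal sketch using the textbook form of Bartlett's statistic; the function name `bartlett` is chosen for this example. With three groups the statistic is compared against a chi-square distribution with 2 degrees of freedom, whose upper tail has the simple closed form exp(-T/2). That shortcut does not apply to other group counts.

```python
import math
from statistics import variance

def bartlett(groups):
    """Bartlett's test statistic for equality of variances."""
    k = len(groups)
    dfs = [len(g) - 1 for g in groups]
    variances = [variance(g) for g in groups]
    df_total = sum(dfs)  # N - k
    pooled = sum(d * v for d, v in zip(dfs, variances)) / df_total
    numerator = df_total * math.log(pooled) - sum(
        d * math.log(v) for d, v in zip(dfs, variances))
    correction = 1 + (sum(1 / d for d in dfs) - 1 / df_total) / (3 * (k - 1))
    return numerator / correction

suppliers = [
    [3.16, 4.35, 3.46, 3.74, 3.61],  # Supplier A
    [4.24, 3.87, 3.87, 4.12, 3.74],  # Supplier B
    [4.58, 4.00, 4.24, 3.87, 3.46],  # Supplier C
]
t_stat = bartlett(suppliers)
# Chi-square upper tail with 2 degrees of freedom (three groups only).
p_value = math.exp(-t_stat / 2)
print(f"Bartlett T = {t_stat:.2f}, P-value = {p_value:.3f}")
```

The values should agree with the Test Statistic of 2.11 and P-Value of 0.348 reported for Bartlett's Test, so the equal variance assumption is not rejected.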

5
ANOVA Minitab

Stat>ANOVA>One-Way Unstacked

6
ANOVA

What does this graph tell us? Which supplier would you select?

[Figure: Boxplot of Supplier A, Supplier B, Supplier C; y-axis: Data, from 3.0 to 4.6]

7
ANOVA Session Window

Stat>ANOVA>One Way

Normal data: P-value > .05, No Difference

Test for Equal Variances: Suppliers vs ID

One-way ANOVA: Suppliers versus ID

Analysis of Variance for Supplier
Source   DF   SS      MS      F      P
ID        2   0.384   0.192   1.40   0.284
Error    12   1.641   0.137
Total    14   2.025

Individual 95% CIs For Mean, Based on Pooled StDev
Level        N   Mean     StDev
Supplier A   5   3.6640   0.4401
Supplier B   5   3.9680   0.2051
Supplier C   5   4.0300   0.4177

[Chart: individual 95% confidence interval for each Mean, on a scale from 3.60 to 4.20]

Pooled StDev = 0.3698
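The ANOVA table can be rebuilt by hand from the raw supplier data. The sketch below uses only the Python standard library; the function name `one_way_anova` and the dictionary keys are chosen for this example. The P-value uses a closed form for the F distribution upper tail that is valid only when the factor has 2 degrees of freedom (three levels), so a general implementation would need an F distribution routine instead.

```python
from statistics import mean

def one_way_anova(groups):
    """Return the sums of squares, degrees of freedom, and F for one factor."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = mean([v for g in groups for v in g])
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {"ss_between": ss_between, "ss_within": ss_within,
            "df_between": df_between, "df_within": df_within, "f": f_stat}

suppliers = [
    [3.16, 4.35, 3.46, 3.74, 3.61],  # Supplier A
    [4.24, 3.87, 3.87, 4.12, 3.74],  # Supplier B
    [4.58, 4.00, 4.24, 3.87, 3.46],  # Supplier C
]
table = one_way_anova(suppliers)
# Upper tail of F(2, df_within); only valid because df_between == 2 here.
p_value = (1 + 2 * table["f"] / table["df_within"]) ** (-table["df_within"] / 2)
print(f"SS between = {table['ss_between']:.3f}, SS within = {table['ss_within']:.3f}")
print(f"F = {table['f']:.2f}, P = {p_value:.3f}")
```

The sums of squares correspond to the ID and Error rows of the table, and the P-value above 0.05 matches the conclusion that no difference between suppliers was detected.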

8
ANOVA

The P-value indicates there is no significant difference between suppliers. What significant difference could have been detected with a sample of 5?

9
Sample Size

Let’s check how much difference we can detect with a sample of 5.

10
ANOVA Assumptions

For ANOVA to be valid, the following assumptions must stand:
1. Errors are normally and independently distributed.
2. Homogeneity of variance among factor levels. (done)

Model adequacy should be checked by:
1. Checking the data for Normality at each level and for homogeneity of variance across all levels. (done)
2. Examining the residuals (a residual is the difference between the true observation and what the model predicts).
   1. Normal plot of the residuals
   2. Residuals versus fits
   3. Residuals versus order

If the model is adequate, the residual plots will be structureless.
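For a one-way ANOVA, the value the model predicts for an observation (the fit) is simply the Mean of its group, so the residuals are easy to compute directly. A minimal sketch for the supplier data, with dictionary names chosen for this example:

```python
from statistics import mean

data = {
    "Supplier A": [3.16, 4.35, 3.46, 3.74, 3.61],
    "Supplier B": [4.24, 3.87, 3.87, 4.12, 3.74],
    "Supplier C": [4.58, 4.00, 4.24, 3.87, 3.46],
}

# Fitted value for every observation in a group is the group Mean.
fits = {name: mean(values) for name, values in data.items()}

# Residual = observation minus fitted value.
residuals = {name: [x - fits[name] for x in values]
             for name, values in data.items()}

for name in data:
    rounded = [round(r, 3) for r in residuals[name]]
    print(f"{name}: fit = {fits[name]:.3f}, residuals = {rounded}")
```

These residuals are what the Normal plot, the residuals versus fits plot, and the residuals versus order plot display. Within each group they sum to zero by construction.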


11
Residual Plots

Stat>ANOVA>One-Way Unstacked>Graphs

12
Normal Probability Plot of Residuals

Normality plot of the residuals should follow a straight line. Results of our example look good. The Normality assumption is satisfied.

[Figure: Normal Probability Plot of the Residuals (responses are Supplier A, Supplier B, Supplier C); x-axis: Residual, y-axis: Percent]

13
Residuals versus Fitted Values

The plot of residuals versus fits examines constant variance. The plot should be structureless with no outliers present. Our example does not indicate a problem.

[Figure: Residuals Versus the Fitted Values (responses are Supplier A, Supplier B, Supplier C); x-axis: Fitted Value, y-axis: Residual]

14
ANOVA Exercise

Practice

1. The quality manager was challenged by the plant director as to why the VOC levels in the product varied so much. The quality manager now wants to find if the product quality is different because of how the shifts work with the product.

2. The quality manager wants to know if the average is different for the ppm VOC of the product among the production shifts.

3. Use Data in columns “ppm VOC” and “Shift” to determine the answer for the quality manager at a 95% confidence level.

15
ANOVA Exercise: Solution

We want to see if the 3 samples are from Normal populations.

In “Variables:” enter
‘ppm VOC’

In “By variables:”
enter ‘Shift’

16
ANOVA Exercise: Solution

The P-value is greater than 0.05 for all three Anderson-Darling Normality Tests, so we conclude the samples are from Normally Distributed populations because we “failed to reject” the null hypothesis that the data sets are from Normal Distributions.

Summary for ppm VOC, by Shift

Statistic                   Shift = 1         Shift = 2         Shift = 3
A-Squared                   0.32              0.37              0.24
P-Value                     0.446             0.334             0.658
Mean                        39.500            34.625            28.000
StDev                       6.761             5.041             6.525
Variance                    45.714            25.411            42.571
Skewness                    0.58976           -0.74123          0.06172
Kurtosis                    -1.13911          1.37039           -1.10012
N                           8                 8                 8
Minimum                     32.000            25.000            19.000
1st Quartile                33.500            31.750            22.000
Median                      38.000            35.500            28.000
3rd Quartile                46.500            37.000            32.750
Maximum                     50.000            42.000            38.000
95% CI for Mean             33.847, 45.153    30.411, 38.839    22.545, 33.455
95% CI for Median           32.936, 48.129    30.614, 37.322    20.871, 33.322
95% CI for StDev            4.470, 13.761     3.333, 10.260     4.314, 13.279

[Figures: histogram of ppm VOC and 95% Confidence Intervals for Mean and Median, one panel per Shift]

17
ANOVA Exercise: Solution

First we need to determine if our data has equal variances.

Stat > ANOVA > Test for Equal Variances…

Now we need to test the variances.

For “Response:” enter ‘ppm VOC’

For “Factors:” enter ‘Shift’


18
ANOVA Exercise: Solution

The P-value of Bartlett's Test (used here because there are three samples of Normal data) was greater than 0.05, so we “fail to reject” the null hypothesis.

[Figure: Test for Equal Variances for ppm VOC; 95% Bonferroni Confidence Intervals for StDevs by Shift (1, 2, 3)]

Bartlett's Test: Test Statistic 0.63, P-Value 0.729
Levene's Test: Test Statistic 0.85, P-Value 0.440

Are the variances equal? Yes!


19
ANOVA Exercise: Solution

We must look at the residual plots to be sure our ANOVA analysis is valid. Since our residuals look Normally Distributed and randomly patterned, we will assume our analysis is correct.

[Figure: Residual Plots for ppm VOC, four panels: Normal Probability Plot (N 24, AD 0.255, P-Value 0.698), Residuals Versus the Fitted Values, Histogram of the Residuals, Residuals Versus the Order of the Data]

20
ANOVA Exercise: Solution

Since the P-value of the ANOVA test is less than 0.05, we “reject” the null hypothesis that the Mean product quality as measured in ppm VOC is the same from all shifts. We “accept” the alternate hypothesis that the Mean product quality is different from at least one shift.

Since the confidence intervals of the Means do not overlap between Shift 1 and Shift 3, we see one of the shifts is delivering a product quality with a higher level of ppm VOC.
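The ANOVA output for this exercise is not reproduced in the text, but the F statistic can be recomputed from the per-shift N, Mean, and Variance given in the graphical summaries. The following is a sketch under that assumption; the resulting numbers are recomputed here, not quoted from the slides, and the function name is made up for this example. The P-value shortcut is valid only because three shifts give 2 degrees of freedom for the factor.

```python
def anova_from_summary(ns, means, variances):
    """One-way ANOVA F statistic from group sizes, Means, and Variances."""
    k = len(ns)
    n_total = sum(ns)
    grand = sum(n * m for n, m in zip(ns, means)) / n_total
    ss_between = sum(n * (m - grand) ** 2 for n, m in zip(ns, means))
    ss_within = sum((n - 1) * v for n, v in zip(ns, variances))
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Shift 1, Shift 2, Shift 3 values from the graphical summaries.
ns = [8, 8, 8]
means = [39.500, 34.625, 28.000]
variances = [45.714, 25.411, 42.571]

f_stat, df1, df2 = anova_from_summary(ns, means, variances)
# Upper tail of F(2, df2); valid only because df1 == 2.
p_value = (1 + 2 * f_stat / df2) ** (-df2 / 2)
print(f"F = {f_stat:.2f} on ({df1}, {df2}) df, P = {p_value:.4f}")
```

The recomputed P-value falls below 0.05, which is consistent with rejecting the null hypothesis of equal Means across shifts.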

21
1-Sample Variance

A 1-sample variance test is used to compare an expected population variance to a target.
Stat > Basic Statistics > Graphical Summary

If the target variance lies inside the confidence interval, fail to reject the null hypothesis.
– Ho: σ²(Sample) = σ²(Target)
– Ha: σ²(Sample) ≠ σ²(Target)

Use the sample size calculations for a 1-sample t-test, since a 1-sample variance test is rarely performed without a 1-sample t-test as well.
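The confidence interval for StDev that the Graphical Summary reports can be sketched with the standard chi-square interval for a Normal sample. The example below uses the Shift 1 figures from the ppm VOC exercise (StDev 6.761, N 8). The chi-square quantiles for 7 degrees of freedom are hard-coded from standard tables, the function name is chosen for this example, and the target value is purely hypothetical.

```python
import math

def stdev_interval(s, n, chi2_low, chi2_high):
    """Two-sided confidence interval for a population StDev (Normal data).

    chi2_low and chi2_high are the lower and upper chi-square quantiles
    for n - 1 degrees of freedom at the chosen confidence level.
    """
    df = n - 1
    lower = math.sqrt(df * s ** 2 / chi2_high)
    upper = math.sqrt(df * s ** 2 / chi2_low)
    return lower, upper

# 95% interval: chi-square quantiles at 0.025 and 0.975 for 7 df.
lower, upper = stdev_interval(6.761, 8, chi2_low=1.68987, chi2_high=16.0128)

target_stdev = 5.0  # hypothetical target, for illustration only
reject_null = not (lower <= target_stdev <= upper)
print(f"95% CI for StDev: {lower:.3f} to {upper:.3f}; reject Ho: {reject_null}")
```

The interval should agree closely with the 4.470 to 13.761 shown for Shift 1. Because the hypothetical target lies inside it, we would fail to reject Ho in this illustration.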

22
1-Sample Variance

1. Practical Problem: (not very common, although practical when analyzing process capability improvement)
• We are considering changing suppliers for a part that we currently purchase from a supplier that charges a premium for the hardening process and has a large variance in their process.
• The proposed new supplier has provided us with a sample of their product. They have stated they can maintain a variance of 0.10.

2. Statistical Problem:
Ho: σ² = 0.10 or Ho: σ = 0.316
Ha: σ² ≠ 0.10 or Ha: σ ≠ 0.316

23
Test of Variance Example

1. Practical Problem:
We want to determine the effect of two different storage
methods on the rotting of potatoes. You study conditions
conducive to potato rot by injecting potatoes with bacteria
that cause rotting and subjecting them to different
temperature and oxygen regimes. We can test the data
to determine if there is a difference in the Standard
Deviation of the rot time between the two different
methods.

2. Statistical Problem:
Ho: σ1 = σ2
Ha: σ1 ≠ σ2

3. Equal variance test (F-test since there are only 2 factors.)

24
Normality Test

Perform another test using the column Rot.

[Figure: Probability Plot of Rot (Normal); x-axis: Rot, y-axis: Percent]

Mean 13.78, StDev 7.712, N 18, AD 0.285, P-Value 0.586

The P-value is > 0.05. We can assume our data is Normally Distributed.

25
Test For Equal Variances

Stat>ANOVA>Test for Equal Variance

26
Test For Equal Variances Graphical Analysis

[Figure: Test for Equal Variances for Rot; 95% Bonferroni Confidence Intervals for StDevs by Temp (10, 16) and Oxygen (2, 6, 10)]

Bartlett's Test: Test Statistic 2.71, P-Value 0.744
Levene's Test: Test Statistic 0.37, P-Value 0.858

A P-value > 0.05 shows no significant difference between the variances.

27
Test For Equal Variances Statistical Analysis

Test for Equal Variances: Rot versus Temp, Oxygen

95% Bonferroni confidence intervals for standard deviations

Temp   Oxygen   N   Lower     StDev     Upper
10      2       3   2.26029   5.29150    81.890
10      6       3   1.28146   3.00000    46.427
10     10       3   2.80104   6.55744   101.481
16      2       3   1.54013   3.60555    55.799
16      6       3   1.50012   3.51188    54.349
16     10       3   3.55677   8.32666   128.862

Bartlett's Test (normal distribution): use this if data is Normal.
Test statistic = 2.71, P-value = 0.744

Levene's Test (any continuous distribution): use this if data is Non-normal.
Test statistic = 0.37, P-value = 0.858

28
1. The quality manager was challenged by the plant director as to why the
VOC levels in the product varied so much. After using a Process Map,
some potential sources of variation were identified. These sources
included operating shifts and raw material supplier. Of course, the
quality manager has already clarified the Gage R&R results were less
than 17% study variation so the gage was acceptable.

2. The quality manager decided to investigate the effect of the raw material supplier. He wants to see if the variation of the product quality is different when using supplier A than supplier B. He wants to be 95% confident the variances are similar when using the two suppliers.

3. Use data ppm VOC and RM Supplier to determine if there is a difference between suppliers.

29
Tests for Variance Exercise: Solution

First we want to do a graphical summary of the two samples from the 2 suppliers.

30
Tests for Variance Exercise: Solution

In “Variables:” enter
‘ppm VOC’

In “By variables:”
enter ‘RM Supplier’

We want to see if the 2 samples are from Normal populations.

31
Tests for Variance Exercise: Solution

The P-value is greater than 0.05 for both Anderson-Darling Normality Tests so we conclude the samples are from Normally Distributed populations because we “failed to reject” the null hypothesis that the data sets are from Normal Distributions.

Summary for ppm VOC, by RM Supplier

Statistic                   RM Supplier = A   RM Supplier = B
A-Squared                   0.33              0.49
P-Value                     0.465             0.175
Mean                        37.583            30.500
StDev                       7.090             6.571
Variance                    50.265            43.182
Skewness                    0.261735          -0.555911
Kurtosis                    -0.091503         -0.988688
N                           12                12
Minimum                     25.000            19.000
1st Quartile                33.250            25.000
Median                      35.500            31.500
3rd Quartile                42.000            37.000
Maximum                     50.000            38.000
95% CI for Mean             33.079, 42.088    26.325, 34.675
95% CI for Median           33.263, 42.000    25.000, 37.000
95% CI for StDev            5.022, 12.038     4.655, 11.157

[Figures: histogram of ppm VOC and 95% Confidence Intervals for Mean and Median, one panel per RM Supplier]

Are both Data Sets Normal?

32
Tests for Variance Exercise: Solution

33
Tests for Variance Exercise: Solution

For “Response:” enter ‘ppm VOC’

For “Factors:” enter ‘RM Supplier’

Note MINITAB™ defaults to a 95% confidence interval, which is exactly the level we want to test for this problem.

34
Tests for Variance Exercise: Solution

Because the 2 populations were considered to be Normally Distributed, the F-test is used to evaluate whether the variances (Standard Deviation squared) are equal.

The P-value of the F-test was greater than 0.05 so we “fail to reject” the null hypothesis.

So once again in English: we found no significant difference between the variances of the results from the two suppliers on our product’s ppm VOC level.
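The F-test statistic for two samples is the ratio of the two sample variances. A minimal sketch using the Variance and N values reported in the graphical summaries for RM Supplier A and B; the variable names are chosen for this example.

```python
# Sample variances and sizes from the graphical summaries.
variance_a, n_a = 50.265, 12  # RM Supplier A
variance_b, n_b = 43.182, 12  # RM Supplier B

# F statistic: ratio of the two sample variances (A over B, as ordered here).
f_stat = variance_a / variance_b
df_numerator, df_denominator = n_a - 1, n_b - 1

print(f"F = {f_stat:.2f} on ({df_numerator}, {df_denominator}) degrees of freedom")
```

The ratio should round to the F-Test statistic of 1.16 reported by the software. Turning it into a P-value requires the F distribution with 11 and 11 degrees of freedom, which the Python standard library does not provide; a statistics package would be needed for that step.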

[Figure: Test for Equal Variances for ppm VOC; 95% Bonferroni Confidence Intervals for StDevs by RM Supplier (A, B), and boxplots of ppm VOC by RM Supplier]

F-Test: Test Statistic 1.16, P-Value 0.806
Levene's Test: Test Statistic 0.02, P-Value 0.890

35
