Lecture 7 (Two Sample Tests)

This document discusses two-sample tests for comparing means between two independent populations or two related populations. It provides information on hypothesis tests to compare the difference between two population means when the variances are unknown and assumed equal or unknown and not assumed equal. An example is shown testing for a difference in dividend yields between stocks on the NYSE and NASDAQ using a pooled variance t-test. The document also discusses constructing a confidence interval for the difference between two population means.

Uploaded by

Carlene Ugay
Copyright
© All Rights Reserved

Two-Sample Tests

Learning Outcomes

In this session, you learn:

• How to use hypothesis testing for comparing the difference between
  • The means of two independent populations
  • The means of two related populations
Two-Sample Tests

• Population means, independent samples
  Example: Group 1 vs. Group 2
• Population means, related samples
  Example: Same group before vs. after treatment
Difference Between Two Means

Population means, independent samples

Goal: Test a hypothesis or form a confidence interval for the difference between two population means, μ1 – μ2.

The point estimate for the difference is $\bar{X}_1 - \bar{X}_2$.

Two cases are considered:
• σ1 and σ2 unknown, assumed equal
• σ1 and σ2 unknown, not assumed equal
Difference Between Two Means: Independent Samples

Population means, independent samples:
• Different data sources
• Unrelated
• Independent
• Sample selected from one population has no effect on the sample selected from the other population

Which test to use:
• σ1 and σ2 unknown, assumed equal: Use Sp to estimate the unknown σ. Use a pooled-variance t test.
• σ1 and σ2 unknown, not assumed equal: Use S1 and S2 to estimate the unknown σ1 and σ2. Use a separate-variance t test.
Hypothesis Tests for Two Population Means
Two Population Means, Independent Samples

Lower-tail test:      Upper-tail test:      Two-tail test:
H0: μ1 ≥ μ2           H0: μ1 ≤ μ2           H0: μ1 = μ2
H1: μ1 < μ2           H1: μ1 > μ2           H1: μ1 ≠ μ2
i.e.,                 i.e.,                 i.e.,
H0: μ1 – μ2 ≥ 0       H0: μ1 – μ2 ≤ 0       H0: μ1 – μ2 = 0
H1: μ1 – μ2 < 0       H1: μ1 – μ2 > 0       H1: μ1 – μ2 ≠ 0
Hypothesis tests for μ1 – μ2
Two Population Means, Independent Samples

• Lower-tail test: H0: μ1 – μ2 ≥ 0, H1: μ1 – μ2 < 0. The rejection region has area α in the lower tail. Reject H0 if tSTAT < -tα.
• Upper-tail test: H0: μ1 – μ2 ≤ 0, H1: μ1 – μ2 > 0. The rejection region has area α in the upper tail. Reject H0 if tSTAT > tα.
• Two-tail test: H0: μ1 – μ2 = 0, H1: μ1 – μ2 ≠ 0. The rejection region has area α/2 in each tail. Reject H0 if tSTAT < -tα/2 or tSTAT > tα/2.
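The three rejection rules can be sketched as a small Python helper. This is only an illustration and not part of the original lecture: the function name `reject_h0` and the `tail` labels are hypothetical, and the critical value is assumed to come from a t table (tα for a one-tail test, tα/2 for a two-tail test).

```python
def reject_h0(t_stat: float, t_crit: float, tail: str) -> bool:
    """Apply the rejection rule for a lower-, upper-, or two-tail t test.

    t_crit is the positive critical value (hypothetical helper):
    t_alpha for a one-tail test, t_(alpha/2) for a two-tail test.
    """
    if t_crit <= 0:
        raise ValueError("t_crit must be the positive critical value")
    if tail == "lower":
        return t_stat < -t_crit
    if tail == "upper":
        return t_stat > t_crit
    if tail == "two":
        return t_stat < -t_crit or t_stat > t_crit
    raise ValueError("tail must be 'lower', 'upper', or 'two'")
```

Passing the positive critical value and letting the function negate it keeps the call the same for all three test types.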
Hypothesis tests for µ1 - µ2 with σ1 and σ2 unknown and assumed equal

Assumptions:

▪ Samples are randomly and independently drawn

▪ Populations are normally distributed or both sample sizes are at least 30

▪ Population variances are unknown but assumed equal
Hypothesis tests for µ1 - µ2 with σ1 and σ2 unknown and assumed equal (continued)

• The pooled variance is:

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$$

• The test statistic is:

$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

• Where tSTAT has d.f. = (n1 + n2 – 2)
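The pooled variance and the pooled-variance test statistic can be sketched in Python from summary statistics alone. This is a minimal sketch, not the lecture's own code: the function names `pooled_variance` and `pooled_t_stat` and the `hypothesized_diff` parameter are assumptions introduced for illustration.

```python
import math


def pooled_variance(n1: int, s1: float, n2: int, s2: float) -> float:
    """Pooled variance S_p^2 from sample sizes and sample standard deviations."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / ((n1 - 1) + (n2 - 1))


def pooled_t_stat(x1, x2, n1, s1, n2, s2, hypothesized_diff=0.0):
    """Return (t_STAT, degrees of freedom) for the pooled-variance t test."""
    sp2 = pooled_variance(n1, s1, n2, s2)
    # Standard error of the difference in sample means under equal variances
    std_err = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t_stat = ((x1 - x2) - hypothesized_diff) / std_err
    return t_stat, n1 + n2 - 2
```

For instance, `pooled_t_stat(3.27, 2.53, 21, 1.30, 25, 1.16)` uses the summary statistics of the lecture's NYSE/NASDAQ example and gives a statistic of about 2.04 with 44 degrees of freedom.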
Confidence interval for µ1 - µ2 with σ1 and σ2 unknown and assumed equal

The confidence interval for μ1 – μ2 is:

$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$$

Where tα/2 has d.f. = n1 + n2 – 2
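A minimal Python sketch of this interval follows. The function name `pooled_ci` is hypothetical, and the critical value `t_crit` is assumed to be read from a t table with n1 + n2 – 2 degrees of freedom rather than computed.

```python
import math


def pooled_ci(x1, x2, n1, s1, n2, s2, t_crit):
    """Confidence interval for mu1 - mu2 using the pooled variance.

    t_crit is t_(alpha/2) with n1 + n2 - 2 d.f. (assumed to come from a table).
    """
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    margin = t_crit * math.sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = x1 - x2
    return diff - margin, diff + margin
```

If the returned interval lies entirely above or below 0, the two-tail test at the matching α rejects H0: μ1 – μ2 = 0.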
Pooled-Variance t Test Example
You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE and NASDAQ? You collect the following data:

                  NYSE    NASDAQ
Number              21        25
Sample mean       3.27      2.53
Sample std dev    1.30      1.16

Assuming both populations are approximately normal with equal variances, is there a difference in mean yield (α = 0.05)?
Pooled-Variance t Test Example: Calculating the Test Statistic (continued)

H0: μ1 - μ2 = 0, i.e. (μ1 = μ2)
H1: μ1 - μ2 ≠ 0, i.e. (μ1 ≠ μ2)

The test statistic is:

$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.5021\left(\dfrac{1}{21} + \dfrac{1}{25}\right)}} = 2.040$$

where the pooled variance is:

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)1.30^2 + (25 - 1)1.16^2}{(21 - 1) + (25 - 1)} = 1.5021$$
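The arithmetic behind these two results can be broken into steps. The intermediate values below are recomputed from the sample statistics in the example and rounded, so they are a check on the slide rather than part of it; the standard error 0.3628 is the same value the lecture uses in the confidence interval for this example.

```latex
\begin{aligned}
S_p^2 &= \frac{20(1.69) + 24(1.3456)}{44}
       = \frac{33.8 + 32.2944}{44}
       = \frac{66.0944}{44} \approx 1.5021 \\
\frac{1}{21} + \frac{1}{25} &\approx 0.08762 \\
\sqrt{1.5021 \times 0.08762} &\approx 0.3628 \\
t_{STAT} &= \frac{3.27 - 2.53}{0.3628} = \frac{0.74}{0.3628} \approx 2.040
\end{aligned}
```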
Pooled-Variance t Test Example: Hypothesis Test Solution

H0: μ1 - μ2 = 0, i.e. (μ1 = μ2)
H1: μ1 - μ2 ≠ 0, i.e. (μ1 ≠ μ2)

α = 0.05
df = 21 + 25 - 2 = 44
Critical Values: t = ± 2.0154 (area .025 in each tail; reject H0 if tSTAT < -2.0154 or tSTAT > 2.0154)

Test Statistic:

$$t_{STAT} = \frac{3.27 - 2.53}{\sqrt{1.5021\left(\dfrac{1}{21} + \dfrac{1}{25}\right)}} = 2.040$$

Decision: Reject H0 at α = 0.05, since 2.040 > 2.0154.

Conclusion: There is evidence of a difference in means.
Pooled-Variance t Test Example: Confidence Interval for µ1 - µ2

Since we rejected H0, can we be 95% confident that µNYSE > µNASDAQ?

95% Confidence Interval for µNYSE - µNASDAQ:

$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = 0.74 \pm 2.0154 \times 0.3628 = (0.009,\ 1.471)$$

Since 0 is less than the entire interval, we can be 95% confident that µNYSE > µNASDAQ.
Hypothesis tests for µ1 - µ2 with σ1 and σ2 unknown, not assumed equal

Assumptions:

▪ Samples are randomly and independently drawn

▪ Populations are normally distributed or both sample sizes are at least 30

▪ Population variances are unknown and cannot be assumed to be equal
Hypothesis tests for µ1 - µ2 with σ1 and σ2 unknown and not assumed equal (continued)

The test statistic is:

$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

tSTAT has d.f. ν, where:

$$\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{\left(\dfrac{S_1^2}{n_1}\right)^2}{n_1 - 1} + \dfrac{\left(\dfrac{S_2^2}{n_2}\right)^2}{n_2 - 1}}$$
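The separate-variance statistic and its degrees of freedom can be sketched in Python as follows. The function name `separate_variance_t` and the `hypothesized_diff` parameter are assumptions made for this illustration; the code simply transcribes the two formulas above.

```python
import math


def separate_variance_t(x1, x2, n1, s1, n2, s2, hypothesized_diff=0.0):
    """Return (t_STAT, nu) for the separate-variance t test.

    nu is the (generally non-integer) degrees of freedom from the formula above.
    """
    v1 = s1**2 / n1  # S1^2 / n1
    v2 = s2**2 / n2  # S2^2 / n2
    t_stat = ((x1 - x2) - hypothesized_diff) / math.sqrt(v1 + v2)
    nu = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t_stat, nu
```

Because ν is rarely a whole number, a table lookup needs an integer; taking `math.floor(nu)` rounds down, which is the conservative choice.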
Separate-Variance t Test Example
You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE and NASDAQ? You collect the following data:

                  NYSE    NASDAQ
Number              21        25
Sample mean       3.27      2.53
Sample std dev    1.30      1.16

Assuming both populations are approximately normal with unequal variances, is there a difference in mean yield (α = 0.05)?
Separate-Variance t Test Example: Calculating the Test Statistic (continued)

H0: μ1 - μ2 = 0, i.e. (μ1 = μ2)
H1: μ1 - μ2 ≠ 0, i.e. (μ1 ≠ μ2)

The test statistic is:

$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} = \frac{(3.27 - 2.53) - 0}{\sqrt{\dfrac{1.30^2}{21} + \dfrac{1.16^2}{25}}} = 2.019$$

The degrees of freedom are:

$$\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{\left(\dfrac{S_1^2}{n_1}\right)^2}{n_1 - 1} + \dfrac{\left(\dfrac{S_2^2}{n_2}\right)^2}{n_2 - 1}} = \frac{\left(\dfrac{1.30^2}{21} + \dfrac{1.16^2}{25}\right)^2}{\dfrac{\left(\dfrac{1.30^2}{21}\right)^2}{20} + \dfrac{\left(\dfrac{1.16^2}{25}\right)^2}{24}} = 40.57$$

Use degrees of freedom = 40.
Separate-Variance t Test Example: Hypothesis Test Solution

H0: μ1 - μ2 = 0, i.e. (μ1 = μ2)
H1: μ1 - μ2 ≠ 0, i.e. (μ1 ≠ μ2)

α = 0.05
df = 40
Critical Values: t = ± 2.021 (area .025 in each tail; reject H0 if tSTAT < -2.021 or tSTAT > 2.021)

Test Statistic: t = 2.019

Decision: Fail to reject H0 at α = 0.05, since 2.019 < 2.021.

Conclusion: There is insufficient evidence of a difference in means.
Related Populations: The Paired Difference Test

Tests Means of 2 Related Populations (related samples)

• Paired or matched samples
• Repeated measures (before/after)
• Use difference between paired values:
  Di = X1i - X2i
• Eliminates Variation Among Subjects
• Assumptions:
  • Both Populations Are Normally Distributed
  • Or, if not Normal, use large samples
Related Populations: The Paired Difference Test (continued)

The ith paired difference is Di, where Di = X1i - X2i

The point estimate for the paired difference population mean μD is $\bar{D}$:

$$\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}$$

The sample standard deviation is SD:

$$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$

n is the number of pairs in the paired sample
The Paired Difference Test: Finding tSTAT

• The test statistic for μD is:

$$t_{STAT} = \frac{\bar{D} - \mu_D}{\dfrac{S_D}{\sqrt{n}}}$$

• Where tSTAT has n - 1 d.f.
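The paired statistic can be sketched in Python from two equal-length lists of measurements. This is an illustrative sketch only: the function name `paired_t_stat` and its parameter names are hypothetical. Differences are taken as first minus second, matching Di = X1i - X2i, and `statistics.stdev` uses the n - 1 divisor.

```python
import math
import statistics


def paired_t_stat(first, second, hypothesized_mean_diff=0.0):
    """Return (D_bar, S_D, t_STAT, degrees of freedom) for paired samples.

    Differences are computed as first[i] - second[i].
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # sample standard deviation (n - 1 divisor)
    t_stat = (d_bar - hypothesized_mean_diff) / (s_d / math.sqrt(n))
    return d_bar, s_d, t_stat, n - 1
```

The order of the two arguments fixes the sign of every difference, so it should match the direction stated in the hypotheses.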


The Paired Difference Test: Possible Hypotheses
Paired Samples

• Lower-tail test: H0: μD ≥ 0, H1: μD < 0. The rejection region has area α in the lower tail. Reject H0 if tSTAT < -tα.
• Upper-tail test: H0: μD ≤ 0, H1: μD > 0. The rejection region has area α in the upper tail. Reject H0 if tSTAT > tα.
• Two-tail test: H0: μD = 0, H1: μD ≠ 0. The rejection region has area α/2 in each tail. Reject H0 if tSTAT < -tα/2 or tSTAT > tα/2.

Where tSTAT has n - 1 d.f.
The Paired Difference Confidence Interval

The confidence interval for μD is

$$\bar{D} \pm t_{\alpha/2}\frac{S_D}{\sqrt{n}}$$

where

$$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$
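As a sketch, the interval can be computed in Python from the list of paired differences. The function name `paired_ci` is hypothetical, and `t_crit` is assumed to be tα/2 with n - 1 degrees of freedom taken from a t table.

```python
import math
import statistics


def paired_ci(diffs, t_crit):
    """Confidence interval for mu_D from paired differences.

    t_crit is t_(alpha/2) with len(diffs) - 1 d.f. (assumed to come from a table).
    """
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    margin = t_crit * statistics.stdev(diffs) / math.sqrt(n)
    return d_bar - margin, d_bar + margin
```

Because the code keeps the unrounded SD, its endpoints can differ in the second decimal place from a hand calculation that first rounds SD.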
Paired Difference Test: Example

• Assume you send your salespeople to a “customer service” training workshop. Has the training made a difference in the number of complaints? You collect the following data:

Number of Complaints:

Salesperson    Before (1)    After (2)    Difference, Di = (2) - (1)
C.B.                6             4            -2
T.F.               20             6           -14
M.H.                3             2            -1
R.K.                0             0             0
M.O.                4             0            -4
Sum                                           -21

$$\bar{D} = \frac{\sum D_i}{n} = \frac{-21}{5} = -4.2$$

$$S_D = \sqrt{\frac{\sum (D_i - \bar{D})^2}{n - 1}} = 5.67$$
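The value SD = 5.67 follows from the five differences in the table. The intermediate sums below are recomputed from those differences as a check and do not appear on the original slide.

```latex
\begin{aligned}
\sum_{i=1}^{5} (D_i - \bar{D})^2
  &= (2.2)^2 + (-9.8)^2 + (3.2)^2 + (4.2)^2 + (0.2)^2 \\
  &= 4.84 + 96.04 + 10.24 + 17.64 + 0.04 = 128.8 \\
S_D &= \sqrt{\frac{128.8}{5 - 1}} = \sqrt{32.2} \approx 5.67
\end{aligned}
```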
Paired Difference Test: Solution

• Has the training made a difference in the number of complaints (at the 0.01 level)?

H0: μD = 0
H1: μD ≠ 0

α = .01, $\bar{D}$ = -4.2
t0.005 = ± 4.604, d.f. = n - 1 = 4
Reject H0 if tSTAT < -4.604 or tSTAT > 4.604 (area α/2 in each tail).

Test Statistic:

$$t_{STAT} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}} = \frac{-4.2 - 0}{5.67 / \sqrt{5}} = -1.66$$

Decision: Do not reject H0 (tSTAT is not in the rejection region).

Conclusion: There is not a significant change in the number of complaints.
The Paired Difference Confidence Interval -- Example

The confidence interval for μD is:

$$\bar{D} \pm t_{\alpha/2}\frac{S_D}{\sqrt{n}}$$

$\bar{D}$ = -4.2, SD = 5.67

99% CI for μD:

$$-4.2 \pm 4.604 \times \frac{5.67}{\sqrt{5}} = (-15.87,\ 7.47)$$

Since this interval contains 0, we cannot be 99% confident that μD ≠ 0.
Session Summary
In this session we discussed
• Comparing two independent samples
• Performed pooled-variance t test for the difference in two means
• Performed separate-variance t test for difference in two means
• Formed confidence intervals for the difference between two means
• Comparing two related samples (paired samples)
• Performed paired t test for the mean difference
• Formed confidence interval for the mean difference
