
Chapter 10. Statistical Inference for Two Samples

Chapter 9 of the Probability & Statistics course focuses on statistical inference for two samples, covering topics such as the difference in means of two normal distributions with known and unknown variances, and inference on two proportions. It includes methods for constructing confidence intervals, hypothesis testing, and exercises for practical application. The chapter provides formulas and examples to illustrate the concepts discussed.

Chapter 9: Statistical Inference for Two Samples

Course Name: PROBABILITY & STATISTICS

Lecturer: Huong Pham

Hanoi, 2023

1 / 38 Chapter 9: Statistical Inference for Two Samples


Content

1 Inference on the Difference in Means of Two Normal Distributions, Variance Known
2 Inference on the Difference in Means of Two Normal Distributions, Variance Unknown
3 Inference on the Two Proportions



Inference on the Difference in Means of Two Normal Distributions, Variance Known
We will assume that
$X_{11}, X_{12}, \ldots, X_{1n_1}$ is a random sample from population 1
$X_{21}, X_{22}, \ldots, X_{2n_2}$ is a random sample from population 2
The two populations represented by $X_1$ and $X_2$ are independent
Both populations are normal
Then, the quantity

$$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

has a $N(0, 1)$ distribution.

Confidence Interval on the Difference in Means, Variances Known

If $\bar{x}_1$ and $\bar{x}_2$ are the means of independent random samples of sizes $n_1$ and $n_2$ from two independent normal populations with known variances $\sigma_1^2$ and $\sigma_2^2$, respectively, a $100(1 - \alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is

$$\bar{x}_1 - \bar{x}_2 - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \;\le\; \mu_1 - \mu_2 \;\le\; \bar{x}_1 - \bar{x}_2 + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

where $z_{\alpha/2}$ is the upper $\alpha/2$ percentage point of the standard normal distribution.

Example 1
A product developer is interested in reducing the drying time of a primer paint. Two formulations of the paint are tested; formulation 1 is the standard chemistry, and formulation 2 has a new drying ingredient that should reduce the drying time. From experience, it is known that the standard deviation of drying time is 8 minutes, and this inherent variability should be unaffected by the addition of the new ingredient. Ten specimens are painted with formulation 1, and another 10 specimens are painted with formulation 2; the 20 specimens are painted in random order. The two sample average drying times are $\bar{x}_1 = 121$ minutes and $\bar{x}_2 = 112$ minutes.

Construct a 95% confidence interval on the difference in means $\mu_1 - \mu_2$.
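The interval asked for in Example 1 can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `z_interval` and its parameter names are illustrative choices, not part of the course material.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar1, xbar2, sigma1, sigma2, n1, n2, alpha=0.05):
    """Two-sided 100(1 - alpha)% CI for mu1 - mu2, standard deviations known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)      # upper alpha/2 point of N(0, 1)
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)   # standard error of xbar1 - xbar2
    diff = xbar1 - xbar2
    return diff - z * se, diff + z * se


# Example 1: sigma1 = sigma2 = 8 minutes, n1 = n2 = 10
lower, upper = z_interval(121, 112, 8, 8, 10, 10)
print(f"{lower:.2f} <= mu1 - mu2 <= {upper:.2f}")
```

With these inputs the interval is roughly 1.99 to 16.01 minutes. It lies entirely above zero, which is consistent with formulation 2 having the shorter mean drying time.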


Sample Size for a Confidence Interval on the Difference in Means, Variances Known
If the standard deviations $\sigma_1$ and $\sigma_2$ are known and the two sample sizes $n_1$ and $n_2$ are equal ($n_1 = n_2 = n$), we can determine the sample size required so that the error in estimating $\mu_1 - \mu_2$ by $\bar{x}_1 - \bar{x}_2$ will be less than $E$ at $100(1 - \alpha)\%$ confidence. The required sample size from each population is

$$n = \left(\frac{z_{\alpha/2}}{E}\right)^2 \left(\sigma_1^2 + \sigma_2^2\right)$$
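As a rough illustration of the sample-size formula, the sketch below reuses the drying-time standard deviations of 8 minutes and a hypothetical target error of E = 4 minutes (a value chosen here for illustration, not given in the slides). Rounding up to the next integer is an assumption of this sketch, made because n must be a whole number.

```python
from math import ceil
from statistics import NormalDist


def sample_size(sigma1, sigma2, error, alpha=0.05):
    """Per-population n so that the estimation error is below `error`."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    n_exact = (z / error) ** 2 * (sigma1**2 + sigma2**2)
    return ceil(n_exact)                      # assumption: round up to a whole specimen


# Hypothetical target: error below 4 minutes at 95% confidence
n = sample_size(8, 8, error=4)
print(n)
```

Halving E multiplies the required n by about four, since E enters the formula squared.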

One-Sided Confidence Bounds

One-Sided Upper Confidence Bound

$$\mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + z_{\alpha}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

One-Sided Lower Confidence Bound

$$\bar{x}_1 - \bar{x}_2 - z_{\alpha}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2$$

Hypothesis Tests on the Difference in Means, Variances Known
Formally, we summarize these results in the following display.
Tests on the Difference in Means, Variances Known
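A sketch of the standard form of this procedure, consistent with the $Z$ statistic defined earlier. The symbol $z_0$ for the observed value and $\Phi$ for the standard normal distribution function are notational assumptions here; $\Delta_0$ is the hypothesized difference, and $\Delta_0 = 0$ corresponds to $H_0: \mu_1 = \mu_2$.

```latex
\[
H_0:\ \mu_1 - \mu_2 = \Delta_0, \qquad
Z_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}
\]
\[
\begin{array}{lll}
\text{Alternative hypothesis} & \text{Reject } H_0 \text{ if} & P\text{-value} \\
H_1:\ \mu_1 - \mu_2 \neq \Delta_0 & |z_0| > z_{\alpha/2} & 2\,[1 - \Phi(|z_0|)] \\
H_1:\ \mu_1 - \mu_2 > \Delta_0 & z_0 > z_{\alpha} & 1 - \Phi(z_0) \\
H_1:\ \mu_1 - \mu_2 < \Delta_0 & z_0 < -z_{\alpha} & \Phi(z_0)
\end{array}
\]
```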

Exercise 1
Consider the hypothesis test $H_0: \mu_1 = \mu_2$ against $H_1: \mu_1 \neq \mu_2$ with known standard deviations $\sigma_1 = 10$ and $\sigma_2 = 5$. Suppose that the sample sizes are $n_1 = 10$ and $n_2 = 15$ and that $\bar{x}_1 = 4.7$ and $\bar{x}_2 = 7.8$. Use $\alpha = 0.05$.
Test the hypothesis and find the P-value.

Exercise 2
Consider the hypothesis test $H_0: \mu_1 = \mu_2$ against $H_1: \mu_1 < \mu_2$ with known standard deviations $\sigma_1 = 10$ and $\sigma_2 = 5$. Suppose that the sample sizes are $n_1 = 10$ and $n_2 = 15$ and that $\bar{x}_1 = 14.2$ and $\bar{x}_2 = 19.7$. Use $\alpha = 0.05$.
Test the hypothesis and find the P-value.
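One way to check answers to both exercises is the sketch below. It assumes that $\sigma_1 = 10$ and $\sigma_2 = 5$ are standard deviations (as the symbols suggest) and uses the usual z statistic for $H_0: \mu_1 = \mu_2$; the function name and the `alternative` labels are illustrative, not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(xbar1, xbar2, sigma1, sigma2, n1, n2, alternative="two-sided"):
    """Return (z0, p_value) for H0: mu1 = mu2 with known standard deviations."""
    z0 = (xbar1 - xbar2) / sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    cdf = NormalDist().cdf
    if alternative == "two-sided":    # H1: mu1 != mu2
        p_value = 2 * (1 - cdf(abs(z0)))
    elif alternative == "less":       # H1: mu1 < mu2
        p_value = cdf(z0)
    elif alternative == "greater":    # H1: mu1 > mu2
        p_value = 1 - cdf(z0)
    else:
        raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")
    return z0, p_value


z1, p1 = two_sample_z_test(4.7, 7.8, 10, 5, 10, 15)              # Exercise 1
z2, p2 = two_sample_z_test(14.2, 19.7, 10, 5, 10, 15, "less")    # Exercise 2
print(f"Exercise 1: z0 = {z1:.3f}, P-value = {p1:.3f}")
print(f"Exercise 2: z0 = {z2:.3f}, P-value = {p2:.3f}")
```

Under these assumptions both P-values exceed 0.05 (about 0.364 and 0.054), so neither null hypothesis would be rejected at $\alpha = 0.05$.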



Inference on the Difference in Means of Two Normal Distributions, Variance Unknown
Case 1: $\sigma_1^2 = \sigma_2^2 = \sigma^2$.
Let
$X_{11}, X_{12}, \ldots, X_{1n_1}$ be a random sample from population 1
$X_{21}, X_{22}, \ldots, X_{2n_2}$ be a random sample from population 2
$\bar{X}_1$, $\bar{X}_2$, $S_1^2$ and $S_2^2$ be the sample means and sample variances, respectively.

Pooled Estimator of Variance

The pooled estimator of $\sigma^2$, denoted by $S_p^2$, is defined by

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$

Given the assumptions of this section, the quantity

$$T = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_p^2}{n_1} + \dfrac{S_p^2}{n_2}}}$$

has a t distribution with $n_1 + n_2 - 2$ degrees of freedom.

Confidence Interval on the Difference in Means, Variances Unknown and Equal

If $\bar{x}_1$, $\bar{x}_2$, $s_1^2$ and $s_2^2$ are the sample means and variances of two random samples of sizes $n_1$ and $n_2$, respectively, from two independent normal populations with unknown but equal variances, then a $100(1 - \alpha)\%$ confidence interval on the difference in means $\mu_1 - \mu_2$ is

$$\bar{x}_1 - \bar{x}_2 - t_{\alpha/2,\,n_1+n_2-2}\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}} \;\le\; \mu_1 - \mu_2 \;\le\; \bar{x}_1 - \bar{x}_2 + t_{\alpha/2,\,n_1+n_2-2}\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$

where $t_{\alpha/2,\,n_1+n_2-2}$ is the upper $\alpha/2$ percentage point of the t distribution with $n_1 + n_2 - 2$ degrees of freedom.

One-Sided Confidence Bounds on the Difference in Means

One-Sided Upper Confidence Bound

$$\mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + t_{\alpha,\,n_1+n_2-2}\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$

One-Sided Lower Confidence Bound

$$\bar{x}_1 - \bar{x}_2 - t_{\alpha,\,n_1+n_2-2}\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}} \le \mu_1 - \mu_2$$

Hypothesis Tests on the Difference in Means, Variances Unknown and Equal

Tests on the Difference in Means of Two Normal Distributions, Variances Unknown and Equal
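A sketch of the standard pooled t procedure, built from the $T$ statistic and the pooled estimator $S_p^2$ defined above. The symbol $t_0$ for the observed value is a notational assumption; $\Delta_0$ is the hypothesized difference, with $\Delta_0 = 0$ for $H_0: \mu_1 = \mu_2$.

```latex
\[
H_0:\ \mu_1 - \mu_2 = \Delta_0, \qquad
T_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{S_p^2/n_1 + S_p^2/n_2}}
\]
\[
\begin{array}{ll}
\text{Alternative hypothesis} & \text{Reject } H_0 \text{ if} \\
H_1:\ \mu_1 - \mu_2 \neq \Delta_0 & |t_0| > t_{\alpha/2,\,n_1+n_2-2} \\
H_1:\ \mu_1 - \mu_2 > \Delta_0 & t_0 > t_{\alpha,\,n_1+n_2-2} \\
H_1:\ \mu_1 - \mu_2 < \Delta_0 & t_0 < -t_{\alpha,\,n_1+n_2-2}
\end{array}
\]
```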

Hypothesis Tests on the Difference in Means, Variances Unknown and Not Assumed Equal

Case 2: $\sigma_1^2 \neq \sigma_2^2$.

Test Statistic for the Difference in Means, Variances Unknown and Not Assumed Equal
If $H_0: \mu_1 - \mu_2 = \Delta_0$ is true, the statistic

$$T_0^* = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

is distributed approximately as t with degrees of freedom given by

$$v = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$

If $v$ is not an integer, round down to the nearest integer.
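The degrees-of-freedom formula is easy to get wrong by hand, so a small sketch may help. The inputs below ($s_1^2 = 4$, $s_2^2 = 6.25$, $n_1 = n_2 = 15$) are illustrative values chosen for this sketch, and the function name `welch_df` is likewise an assumption.

```python
from math import floor


def welch_df(s1_sq, s2_sq, n1, n2):
    """Approximate degrees of freedom v and its rounded-down value."""
    a = s1_sq / n1                    # s1^2 / n1
    b = s2_sq / n2                    # s2^2 / n2
    v = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
    return v, floor(v)                # round down when v is not an integer


# Illustrative inputs (not from a specific exercise on unequal variances)
v, v_rounded = welch_df(4.0, 6.25, 15, 15)
print(f"v = {v:.2f}, use {v_rounded} degrees of freedom")
```

Here $v \approx 26.71$, which rounds down to 26. That is slightly fewer than the $n_1 + n_2 - 2 = 28$ degrees of freedom of the equal-variance case.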


Approximate Confidence Interval on the Difference in Means, Variances Unknown and Not Assumed Equal
If $\bar{x}_1$, $\bar{x}_2$, $s_1^2$ and $s_2^2$ are the means and variances of two random samples of sizes $n_1$ and $n_2$, respectively, from two independent normal populations with unknown and unequal variances, an approximate $100(1 - \alpha)\%$ confidence interval on the difference in means $\mu_1 - \mu_2$ is

$$\bar{x}_1 - \bar{x}_2 - t_{\alpha/2,\,v}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \;\le\; \mu_1 - \mu_2 \;\le\; \bar{x}_1 - \bar{x}_2 + t_{\alpha/2,\,v}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where $t_{\alpha/2,\,v}$ is the upper $\alpha/2$ percentage point of the t distribution with $v$ degrees of freedom, and $v$ is the degrees of freedom used for the statistic $T_0^*$.

Exercise 1
Consider the hypothesis test $H_0: \mu_1 = \mu_2$ against $H_1: \mu_1 \neq \mu_2$. Suppose that the sample sizes are $n_1 = 15$ and $n_2 = 15$ and that $\bar{x}_1 = 4.7$, $\bar{x}_2 = 7.8$, $s_1^2 = 4$ and $s_2^2 = 6.25$. Assume that $\sigma_1^2 = \sigma_2^2$ and that the data are drawn from normal distributions. Use $\alpha = 0.05$.
a) Test the hypothesis.
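A possible check for part (a), sketched with the pooled estimator and the T statistic from this section. The Python standard library has no t quantile function, so the critical value $t_{0.025,28} = 2.048$ is hard-coded from a standard t table; the function and variable names are illustrative.

```python
from math import sqrt


def pooled_t_statistic(xbar1, xbar2, s1_sq, s2_sq, n1, n2):
    """Return (t0, df, sp_sq) for H0: mu1 = mu2, equal variances assumed."""
    df = n1 + n2 - 2
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df   # pooled variance
    t0 = (xbar1 - xbar2) / sqrt(sp_sq / n1 + sp_sq / n2)
    return t0, df, sp_sq


t0, df, sp_sq = pooled_t_statistic(4.7, 7.8, 4.0, 6.25, 15, 15)

T_CRIT = 2.048                 # t_{0.025, 28}, taken from a t table (assumption of this sketch)
reject = abs(t0) > T_CRIT      # two-sided test at alpha = 0.05
print(f"sp^2 = {sp_sq:.3f}, t0 = {t0:.3f}, df = {df}, reject H0: {reject}")
```

With these numbers $s_p^2 = 5.125$ and $t_0 \approx -3.75$, whose magnitude exceeds 2.048, so the sketch indicates rejecting $H_0$ at $\alpha = 0.05$.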



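Inference on the Two Proportions

Two independent random samples of sizes $n_1$ and $n_2$ (large enough), in which $x_1$ and $x_2$ observations, respectively, belong to the class of interest.
Sample proportions: $\hat{p}_1 = \dfrac{x_1}{n_1}$, $\hat{p}_2 = \dfrac{x_2}{n_2}$
$\hat{p}_1 - \hat{p}_2$ is a point estimator of $p_1 - p_2$
If $n_1$, $n_2$ are large enough, we have approximately

$$\hat{p}_1 - \hat{p}_2 \sim N\left(p_1 - p_2,\ \frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}\right)$$

Pooled proportion

$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$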

Confidence Interval on the Difference in Population Proportions

Approximate Confidence Interval on the Difference in Population Proportions
If $\hat{p}_1$ and $\hat{p}_2$ are the sample proportions of observations in two independent random samples of sizes $n_1$ and $n_2$ that belong to a class of interest, an approximate two-sided $100(1 - \alpha)\%$ confidence interval on the difference in the true proportions $p_1 - p_2$ is

$$\hat{p}_1 - \hat{p}_2 - z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} \;\le\; p_1 - p_2 \;\le\; \hat{p}_1 - \hat{p}_2 + z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$

where $z_{\alpha/2}$ is the upper $\alpha/2$ percentage point of the standard normal distribution.
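To make the interval concrete, here is a minimal sketch with hypothetical counts: 30 successes out of 100 in sample 1 and 20 out of 100 in sample 2. These numbers are invented for illustration only, as is the function name.

```python
from math import sqrt
from statistics import NormalDist


def proportion_diff_interval(x1, n1, x2, n2, alpha=0.05):
    """Approximate two-sided CI for p1 - p2 from success counts x1, x2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z * se, diff + z * se


# Hypothetical data: 30/100 versus 20/100
lower, upper = proportion_diff_interval(30, 100, 20, 100)
print(f"{lower:.4f} <= p1 - p2 <= {upper:.4f}")
```

For these counts the interval runs from about -0.019 to 0.219. Because it contains zero, the data would not rule out $p_1 = p_2$ at the 95% level.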
Large-Sample Tests on the Difference in Population Proportions
We are interested in testing the hypotheses

$$H_0: p_1 = p_2$$
$$H_1: p_1 \neq p_2$$

Test Statistic:

$$Z = \frac{\hat{P}_1 - \hat{P}_2 - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}}$$

Approximate Tests on the Difference of Two Population Proportions
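Under $H_0: p_1 = p_2$ the common proportion is unknown. The usual large-sample approach, assumed in this sketch, substitutes the pooled proportion $\hat{p}$ defined earlier, giving $z_0 = (\hat{p}_1 - \hat{p}_2) / \sqrt{\hat{p}(1 - \hat{p})(1/n_1 + 1/n_2)}$. The counts below (30 of 100 versus 20 of 100) are hypothetical, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z0, p_value) for H0: p1 = p2 against H1: p1 != p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)            # pooled proportion under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z0 = (p1_hat - p2_hat) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z0)))   # two-sided P-value
    return z0, p_value


# Hypothetical data: 30/100 versus 20/100
z0, p_value = two_proportion_z_test(30, 100, 20, 100)
print(f"z0 = {z0:.3f}, P-value = {p_value:.3f}")
```

For these counts $z_0 \approx 1.633$ with a P-value of roughly 0.10, so $H_0$ would not be rejected at $\alpha = 0.05$.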
