

Lesson 10: Tests About One Mean



Overview
In this lesson, we'll continue our investigation of hypothesis testing. In this case, we'll focus our
attention on a hypothesis test for a population mean μ in three situations:

a hypothesis test based on the normal distribution for the mean μ, for the completely unrealistic
situation in which the population variance σ² is known
a hypothesis test based on the t-distribution for the mean μ, for the (much more) realistic
situation in which the population variance σ² is unknown
a hypothesis test based on the t-distribution for μ_D, the mean difference in the responses of
two dependent populations

10.1 - Z-Test: When Population Variance is Known


Let's start by acknowledging that it is completely unrealistic to think that we'd find ourselves in the
situation of knowing the population variance, but not the population mean. Therefore, the
hypothesis testing method that we learn on this page has limited practical use. We study it only
because we'll use it later to learn about the "power" of a hypothesis test (by learning how to
calculate Type II error rates). As usual, let's start with an example.

Example 10-1

Boys of a certain age are known to have a mean weight of μ = 85 pounds. A complaint is made that
the boys living in a municipal children's home are underfed. As one bit of evidence, n = 25 boys (of
the same age) are weighed and found to have a mean weight of x̄ = 80.94 pounds. It is known that
the population standard deviation σ is 11.6 pounds (the unrealistic part of this example!). Based on
the available data, what should be concluded concerning the complaint?


Answer

The null hypothesis is H₀: μ = 85, and the alternative hypothesis is Hₐ: μ < 85. In general, we
know that if the weights are normally distributed, then:

Z = (X̄ − μ) / (σ/√n)

follows the standard normal distribution. It is actually a bit irrelevant here whether or not
the weights are normally distributed, because the sample size is large enough for the Central
Limit Theorem to apply. In that case, we know that Z, as defined above, follows at least
approximately the standard normal distribution. At any rate, it seems reasonable to use the test
statistic:

Z = (X̄ − μ₀) / (σ/√n)

for testing the null hypothesis H₀: μ = μ₀

against any of the possible alternative hypotheses Hₐ: μ ≠ μ₀, Hₐ: μ < μ₀, and Hₐ: μ > μ₀.

For the example in hand, the value of the test statistic is:

Z = (80.94 − 85) / (11.6/√25) = −1.75

The critical region approach tells us to reject the null hypothesis at the α = 0.05 level if
Z ≤ −1.645 (the lower 0.05 cutoff of the standard normal distribution). Therefore, we reject the
null hypothesis because Z = −1.75 ≤ −1.645, and therefore falls in the rejection region:

[Figure: standard normal curve with the rejection region Z ≤ −1.645 shaded; the observed test statistic Z = −1.75 falls in the rejection region.]

As always, we draw the same conclusion by using the p-value approach. Recall that the p-value
approach tells us to reject the null hypothesis at the α = 0.05 level if the p-value ≤ α = 0.05. In
this case, the p-value is P(Z ≤ −1.75) = 0.0401:

https://online.stat.psu.edu/stat415/book/export/html/827 2/12
7/22/23, 1:36 PM Lesson 10: Tests About One Mean

[Figure: standard normal curve with the area to the left of Z = −1.75 shaded; this area is the p-value, 0.0401.]

As expected, we reject the null hypothesis because the p-value = 0.0401 ≤ α = 0.05.

By the way, we'll learn how to ask Minitab to conduct the Z-test for a mean in a bit, but this is
what the Minitab output for this example looks like:

Test of mu = 85 vs < 85
The assumed standard deviation = 11.6
95% Upper
N Mean SE Mean Bound Z P

25 80.9400 2.3200 84.7561 -1.75 0.040
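
For readers who want to reproduce the calculation outside of Minitab, here is a minimal sketch in Python using scipy (not part of the original lesson, which uses Minitab); the numbers are the summary statistics from the example above:

from math import sqrt
from scipy.stats import norm

# Boys' weight example: n = 25, x-bar = 80.94, hypothesized mu = 85, known sigma = 11.6
n, xbar, mu0, sigma = 25, 80.94, 85, 11.6

z = (xbar - mu0) / (sigma / sqrt(n))   # test statistic, -1.75
p_value = norm.cdf(z)                  # lower-tailed p-value (Ha: mu < 85), 0.0401

print(f"Z = {z:.2f}, p-value = {p_value:.4f}")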

10.2 - T-Test: When Population Variance is Unknown


Now that, for purely pedagogical reasons, we have the unrealistic situation (of a known population
variance) behind us, let's turn our attention to the realistic situation in which both the population
mean and population variance are unknown.

Example 10-2


It is assumed that the mean systolic blood pressure is μ = 120 mm Hg. In the Honolulu Heart Study,
a sample of n = 100 people had an average systolic blood pressure of x̄ = 130.1 mm Hg with a
standard deviation of s = 21.21 mm Hg. Is the group significantly different (with respect to systolic
blood pressure!) from the regular population?

Answer

The null hypothesis is H₀: μ = 120, and because there is no specific direction implied, the
alternative hypothesis is Hₐ: μ ≠ 120. In general, we know that if the data are normally distributed,
then:

T = (X̄ − μ) / (S/√n)

follows a t-distribution with n − 1 degrees of freedom. Therefore, it seems reasonable to use the
test statistic:

T = (X̄ − μ₀) / (S/√n)

for testing the null hypothesis H₀: μ = μ₀ against any of the possible alternative hypotheses
Hₐ: μ ≠ μ₀, Hₐ: μ < μ₀, and Hₐ: μ > μ₀. For the example in hand, the value of the test
statistic is:

t = (130.1 − 120) / (21.21/√100) = 4.762

The critical region approach tells us to reject the null hypothesis at the α = 0.05 level if
t ≥ 1.9842 or if t ≤ −1.9842 (the 0.025 cutoffs of a t-distribution with 99 degrees of freedom).
Therefore, we reject the null hypothesis because t = 4.762 > 1.9842, and therefore falls in the
rejection region:

[Figure: t-distribution with 99 degrees of freedom; rejection region beyond ±1.9842; the test statistic t = 4.762 falls in the upper rejection region.]

Again, as always, we draw the same conclusion by using the p-value approach. The p-value approach
tells us to reject the null hypothesis at the α = 0.05 level if the p-value ≤ α = 0.05. In this case, the
p-value is 2 × P(T₉₉ ≥ 4.762), which is less than 0.0001:


[Figure: t-distribution; the two-tailed p-value is the area beyond ±4.762.]

As expected, we reject the null hypothesis because the p-value (< 0.0001) ≤ α = 0.05.

Again, we'll learn how to ask Minitab to conduct the t-test for a mean in a bit, but this is what the
Minitab output for this example looks like:

Test of mu = 120 vs not = 120


N Mean StDev SE Mean 95% CI T P

100 130.100 21.210 2.121 (125.891, 134.309) 4.76 0.000
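
The same computation can be checked with a short Python sketch (an assumed alternative to Minitab, not part of the original lesson), using only the summary statistics reported above:

from math import sqrt
from scipy.stats import t

# Honolulu Heart Study example: n = 100, x-bar = 130.1, s = 21.21, hypothesized mu = 120
n, xbar, mu0, s = 100, 130.1, 120, 21.21

t_stat = (xbar - mu0) / (s / sqrt(n))        # test statistic, about 4.76
p_value = 2 * t.sf(abs(t_stat), df=n - 1)    # two-sided p-value, well below 0.0001

print(f"t = {t_stat:.3f}, p-value = {p_value:.2g}")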

By the way, the decision to reject the null hypothesis is consistent with the one you would make
using a 95% confidence interval. Using the data, a 95% confidence interval for the mean μ is:

x̄ ± 1.9842(s/√n) = 130.1 ± 1.9842(2.121)

which simplifies to (125.89, 134.31). That is, we can be 95% confident that the mean systolic blood
pressure of the Honolulu population is between 125.89 and 134.31 mm Hg. How can a population
living in a climate with consistently sunny 80 degree days have elevated blood pressure?!

Anyway, the critical region approach for the α = 0.05 hypothesis test tells us to reject the null
hypothesis that μ = 120:

if t ≥ 1.9842 or if t ≤ −1.9842

which is equivalent to rejecting:

if x̄ − 120 ≥ 1.9842(s/√n) or if x̄ − 120 ≤ −1.9842(s/√n)

which is equivalent to rejecting:

if 120 ≤ x̄ − 1.9842(s/√n) or if 120 ≥ x̄ + 1.9842(s/√n)

which, upon inserting the data for this particular example, is equivalent to rejecting:

if 120 ≤ 125.891 or if 120 ≥ 134.309

which just happen to be (!) the endpoints of the 95% confidence interval for the mean. Indeed, the
results are consistent!
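
A quick numerical check of this equivalence (a minimal sketch, again assuming Python with scipy rather than Minitab) rebuilds the 95% confidence interval from the summary statistics and asks whether the hypothesized mean falls inside it:

from math import sqrt
from scipy.stats import t

n, xbar, s, mu0 = 100, 130.1, 21.21, 120
margin = t.ppf(0.975, df=n - 1) * s / sqrt(n)   # 1.9842 * 2.121
lo, hi = xbar - margin, xbar + margin           # (125.891, 134.309)

# 120 lies outside the interval, so the 0.05-level test rejects H0: mu = 120
print(f"95% CI: ({lo:.3f}, {hi:.3f}); contains {mu0}? {lo <= mu0 <= hi}")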

10.3 - Paired T-Test


In the next lesson, we'll learn how to compare the means of two independent populations, but there
may be occasions in which we are interested in comparing the means of two dependent populations.
For example, suppose a researcher is interested in determining whether the mean IQ of the
population of first-born twins differs from the mean IQ of the population of second-born twins. She
identifies a random sample of n pairs of twins, and measures X, the IQ of the first-born twin, and Y,
the IQ of the second-born twin. In that case, she's interested in determining whether:

μ_X = μ_Y

or equivalently if:

μ_X − μ_Y = 0

Now, the population of first-born twins is not independent of the population of second-born twins.
Since all of our distributional theory requires the independence of measurements, we're rather stuck.
There's a way out though... we can "remove" the dependence between X and Y by subtracting the
two measurements X_i and Y_i for each pair of twins i, that is, by considering the independent
measurements

D_i = X_i − Y_i

Then, our null hypothesis involves just a single mean, which we'll denote μ_D, the mean of the
differences:

H₀: μ_D = μ_X − μ_Y = 0

and then our hard work is done! We can just use the t-test for a mean for conducting the hypothesis
test... it's just that, in this situation, our measurements are differences whose mean is d̄ and
standard deviation is s_D. That is, when testing the null hypothesis H₀: μ_D = 0 against any of the
alternative hypotheses Hₐ: μ_D ≠ 0, Hₐ: μ_D < 0, and Hₐ: μ_D > 0, we compare the test
statistic:

t = d̄ / (s_D/√n)

to a t-distribution with n − 1 degrees of freedom. Let's take a look at an example!


Example 10-3

Blood samples from n = 10 people were sent to each of two laboratories (Lab 1 and Lab 2) for
cholesterol determinations. The resulting data are summarized here:

Subject Lab 1 Lab 2 Diff

1 296 318 -22

2 268 287 -19


. . . .

. . . .
. . . .

10 262 285 -23

Is there a statistically significant difference at the α = 0.05 level, say, in the (population) mean
cholesterol levels reported by Lab 1 and Lab 2?

Answer

The null hypothesis is H₀: μ_D = 0, and the alternative hypothesis is Hₐ: μ_D ≠ 0. The value of the
test statistic is:

t = d̄ / (s_D/√n) = −14.4 / (6.77/√10) = −6.73

The critical region approach tells us to reject the null hypothesis at the α = 0.05 level if
t ≥ 2.2622 or if t ≤ −2.2622 (the 0.025 cutoffs of a t-distribution with 9 degrees of freedom).
Therefore, we reject the null hypothesis because t = −6.73 ≤ −2.2622, and therefore falls in the
rejection region.

Again, we draw the same conclusion when using the p-value approach. In this case, the p-value is:

p-value = 2 × P(T₉ ≤ −6.73), which is less than 0.0001


As expected, we reject the null hypothesis because the p-value (< 0.0001) ≤ α = 0.05.

And, the Minitab output for this example looks like this:

Test of mu = 0 vs not = 0
N Mean StDev SE Mean 95% CI T P
10 -14.4000 6.7700 2.1409 (-19.2430, -9.5570) -6.73 0.000
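
With the raw Lab 1 and Lab 2 columns one could call scipy.stats.ttest_rel directly; since only a few rows are reproduced here, the following sketch (an assumption, not the lesson's own method) works from the summary of the differences instead:

from math import sqrt
from scipy.stats import t

# Summary of the n = 10 paired differences (Lab 1 - Lab 2)
n, dbar, s_d = 10, -14.4, 6.77

t_stat = dbar / (s_d / sqrt(n))              # test statistic, about -6.73
p_value = 2 * t.sf(abs(t_stat), df=n - 1)    # two-sided p-value, about 0.0001

print(f"t = {t_stat:.2f}, p-value = {p_value:.4f}")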

10.4 - Using Minitab


Z-Test for a Single Mean


To illustrate how to tell Minitab to perform a Z-test for a single mean, let's refer to the boys' weight
example that appeared on the page called The Z-test: When Population Variance is Known.

1. Under the Stat menu, select Basic Statistics, and then 1-Sample Z...:

2. In the pop-up window that appears, click on the radio button labeled Summarized data. In
the box labeled Sample size, type in the sample size n, and in the box labeled Mean, type in
the sample mean. In the box labeled Standard deviation, type in the value of the known (or
rather assumed!) population standard deviation. Click on the box labeled Perform hypothesis
test, and in the box labeled Hypothesized mean, type in the value of the mean assumed in
the null hypothesis:


3. Click on the button labeled Options... In the pop-up window that appears, for the box
labeled Alternative, select either less than, greater than, or not equal depending on the
direction of the alternative hypothesis:

Then, click OK to return to the main pop-up window.

4. Then, upon clicking OK on the main pop-up window, the output should appear in the
Session window:

Test of mu = 85 vs < 85
The assumed standard deviation = 11.6
95% Upper
N Mean SE Mean Bound Z P

25 80.94 2.32 84.76 -1.75 0.040

T-test for a Single Mean


To illustrate how to tell Minitab to perform a t-test for a single mean, let's refer to the systolic blood
pressure example that appeared on the page called The T-test: When Population Variance is
Unknown.

1. Under the Stat menu, select Basic Statistics, and then 1-Sample t...:

2. In the pop-up window that appears, click on the radio button labeled Summarized data. In
the box labeled Sample size, type in the sample size n; in the box labeled Mean, type in the
sample mean; and in the box labeled Standard deviation, type in the sample standard
deviation. Click on the box labeled Perform hypothesis test, and in the box labeled
Hypothesized mean, type in the value of the mean assumed in the null hypothesis:

3. Click on the button labeled Options... In the pop-up window that appears, for the box
labeled Alternative, select either less than, greater than, or not equal depending on the
direction of the alternative hypothesis:


Then, click OK to return to the main pop-up window.

4. Then, upon clicking OK on the main pop-up window, the output should appear in the
Session window:

Test of mu = 120 vs not = 120


N Mean StDev SE Mean 95% CI T P

100 130.10 21.21 2.12 (125.89, 134.31) 4.76 0.000

5. Note that a paired t-test can be performed in the same way. The summarized sample data
would simply be the summarized differences. The extra step of calculating the differences
would be required, however, if your data are the raw measurements from the two dependent
samples. That is, if you have two columns containing, say, Before and After measurements for
which you want to analyze Diff, their differences, you can use Minitab's calculator (under the
Calc menu, select Calculator) to calculate the differences:


Upon clicking OK, the differences (Diff) should appear in your worksheet:


When performing the t-test, you'll then need to tell Minitab (in the Samples in columns box)
that the differences are contained in the Diff column:

Here's what the paired t-test output would look like for this example:

One Sample T: Diff

Test of mu = 0 vs not = 0
Variable N Mean StDev SE Mean 95% CI T P
Diff 7 2.000 1.414 0.535 (0.692, 3.308) 3.74 0.010


Source: https://online.stat.psu.edu/stat415/lesson/10


Lesson 11: Tests of the Equality of Two Means



Overview
In this lesson, we'll continue our investigation of hypothesis testing. In this case, we'll focus our
attention on a hypothesis test for the difference in two population means, μ_X − μ_Y, for two
situations:

a hypothesis test based on the t-distribution, known as the pooled two-sample t-test, for
when the (unknown) population variances σ²_X and σ²_Y are equal
a hypothesis test based on the t-distribution, known as Welch's t-test, for when the
(unknown) population variances σ²_X and σ²_Y are not equal

Of course, because population variances are generally not known, there is no way of being 100%
sure that the population variances are equal or not equal. In order to be able to determine,
therefore, which of the two hypothesis tests we should use, we'll need to make some assumptions
about the equality of the variances based on our previous knowledge of the populations we're
studying.

11.1 - When Population Variances Are Equal


Let's start with the good news, namely that we've already done the dirty theoretical work in
developing a hypothesis test for the difference in two population means, μ_X − μ_Y, when we
developed a confidence interval for the difference in two population means. Recall
that if you have two independent samples from two normal distributions with equal variances
σ²_X = σ²_Y = σ², then:

T = [(X̄ − Ȳ) − (μ_X − μ_Y)] / [S_p √(1/n + 1/m)]

follows a t distribution with n + m − 2 degrees of freedom, where S²_p, the pooled sample variance:

S²_p = [(n − 1)S²_X + (m − 1)S²_Y] / (n + m − 2)

is an unbiased estimator of the common variance σ². Therefore, if we're interested in testing the null
hypothesis:

H₀: μ_X − μ_Y = 0 (or equivalently H₀: μ_X = μ_Y)

against any of the alternative hypotheses:

Hₐ: μ_X − μ_Y ≠ 0, Hₐ: μ_X − μ_Y < 0, or Hₐ: μ_X − μ_Y > 0

we can use the test statistic:

T = (X̄ − Ȳ) / [S_p √(1/n + 1/m)]

and follow the standard hypothesis testing procedures. Let's take a look at an example.

Example 11-1

A psychologist was interested in exploring whether or not male and female college students have
different driving behaviors. There were several ways that she could quantify driving behaviors. She
opted to focus on the fastest speed ever driven by an individual. Therefore, the particular statistical
question she framed was as follows:

Is the mean fastest speed driven by male college students different than the mean fastest speed
driven by female college students?

She conducted a survey of a random n = 34 male college students and a random m = 29 female
college students. Here is a descriptive summary of the results of her survey:

              Males (X)   Females (Y)
Sample size      34           29
Sample mean     105.5         90.9
Sample sd        20.1         12.2

and here is a graphical summary of the data in the form of a dotplot:


[Figure: dotplots of fastest speed driven (roughly 56 to 140 mph), shown separately for female and male students.]

Is there sufficient evidence at the α = 0.05 level to conclude that the mean fastest speed driven by
male college students differs from the mean fastest speed driven by female college students?

Answer

Because the observed standard deviations of the two samples are of similar magnitude, we'll assume
that the population variances are equal. Let's also assume that the two populations of fastest speed
driven for males and females are normally distributed. (We can confirm, or deny, such an assumption
using a normal probability plot, but let's simplify our analysis for now.) The randomness of the two
samples allows us to assume independence of the measurements as well.

Okay, assumptions all met, we can test the null hypothesis:

H₀: μ_X − μ_Y = 0

against the alternative hypothesis:

Hₐ: μ_X − μ_Y ≠ 0

using the test statistic:

t = (105.5 − 90.9) / [16.9 √(1/34 + 1/29)] ≈ 3.42

because, among other things, the pooled sample standard deviation is:

s_p = √[ ((34 − 1)(20.1)² + (29 − 1)(12.2)²) / (34 + 29 − 2) ] ≈ 16.9

The critical value approach tells us to reject the null hypothesis in favor of the alternative
hypothesis if:

t ≥ 1.9996 or t ≤ −1.9996 (the 0.025 cutoffs of a t-distribution with 61 degrees of freedom)

We reject the null hypothesis because the test statistic (t = 3.42) falls in the rejection region:


[Figure: t-distribution with 61 degrees of freedom; rejection region beyond ±1.9996; the test statistic t = 3.42 falls in the upper rejection region.]

There is sufficient evidence at the α = 0.05 level to conclude that the average fastest speed driven
by the population of male college students differs from the average fastest speed driven by the
population of female college students.

Not surprisingly, the decision is the same using the p-value approach. The p-value is 0.0012:

p-value = 2 × P(T₆₁ ≥ 3.42) = 0.0012

Therefore, because p = 0.0012 ≤ α = 0.05, we reject the null hypothesis in favor of the alternative
hypothesis. Again, we conclude that there is sufficient evidence at the α = 0.05 level to conclude
that the average fastest speed driven by the population of male college students differs from the
average fastest speed driven by the population of female college students.

By the way, we'll see how to tell Minitab to conduct a two-sample t-test in a bit here, but in the
meantime, this is what the output would look like:

Two-Sample T: For Fastest

Gender N Mean StDev SE Mean


1 34 105.5 20.1 3.4
2 29 90.9 12.2 2.3

Difference = mu (1) - mu (2)


Estimate for difference: 14.6085
95% CI for difference: (6.0630, 23.1540)
T-Test of difference = 0 (vs not =) : T-Value = 3.42 P-Value = 0.001 DF = 61
Both use Pooled StDev = 16.9066
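
The pooled two-sample t-test is also easy to reproduce from the summary statistics alone; here is a minimal Python sketch (assumed, not part of the original lesson) using scipy's ttest_ind_from_stats:

from scipy.stats import ttest_ind_from_stats

# Fastest speed driven: males (n = 34, mean 105.5, sd 20.1) vs. females (n = 29, mean 90.9, sd 12.2)
result = ttest_ind_from_stats(mean1=105.5, std1=20.1, nobs1=34,
                              mean2=90.9,  std2=12.2, nobs2=29,
                              equal_var=True)   # pooled two-sample t-test

print(result)   # statistic about 3.42, p-value about 0.001 (61 degrees of freedom)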

11.2 - When Population Variances Are Not Equal


Let's again start with the good news that we've already done the dirty theoretical work here. Recall
that if you have two independent samples from two normal distributions with unequal variances
σ²_X ≠ σ²_Y, then:

T = [(X̄ − Ȳ) − (μ_X − μ_Y)] / √(S²_X/n + S²_Y/m)

follows, at least approximately, a t distribution with r degrees of freedom, where r, the adjusted
degrees of freedom, is determined by the equation:

r = (S²_X/n + S²_Y/m)² / [ (S²_X/n)²/(n − 1) + (S²_Y/m)²/(m − 1) ]

If r doesn't equal an integer, as it usually doesn't, then we take the integer portion of r. That is, we
use ⌊r⌋ if necessary.

With that now being recalled, if we're interested in testing the null hypothesis:

H₀: μ_X − μ_Y = 0 (or equivalently H₀: μ_X = μ_Y)

against any of the alternative hypotheses:

Hₐ: μ_X − μ_Y ≠ 0, Hₐ: μ_X − μ_Y < 0, or Hₐ: μ_X − μ_Y > 0

we can use the test statistic:

T = (X̄ − Ȳ) / √(S²_X/n + S²_Y/m)

and follow the standard hypothesis testing procedures. Let's return to our fastest speed driven
example.

Example 11-1 (Continued)

A psychologist was interested in exploring whether or not male and female college students have
different driving behaviors. There were a number of ways that she could quantify driving behaviors.
She opted to focus on the fastest speed ever driven by an individual. Therefore, the particular
statistical question she framed was as follows:

Is the mean fastest speed driven by male college students different than the mean fastest speed
driven by female college students?

She conducted a survey of a random n = 34 male college students and a random m = 29 female
college students. Here is a descriptive summary of the results of her survey:

              Males (X)   Females (Y)
Sample size      34           29
Sample mean     105.5         90.9
Sample sd        20.1         12.2

Is there sufficient evidence at the α = 0.05 level to conclude that the mean fastest speed driven by
male college students differs from the mean fastest speed driven by female college students?

Answer

This time let's not assume that the population variances are equal. Then, we'll see if we arrive at a
different conclusion. Let's still assume though that the two populations of fastest speed driven for
males and females are normally distributed. And, we'll again permit the randomness of the two
samples to allow us to assume independence of the measurements as well.

That said, then we can test the null hypothesis:

H₀: μ_X − μ_Y = 0

against the alternative hypothesis:

Hₐ: μ_X − μ_Y ≠ 0

comparing the test statistic:

t = (105.5 − 90.9) / √(20.1²/34 + 12.2²/29) ≈ 3.54

to a t distribution with r degrees of freedom, where:

r = (20.1²/34 + 12.2²/29)² / [ (20.1²/34)²/33 + (12.2²/29)²/28 ] ≈ 55.5

Oops... that's not an integer, so we're going to need to take the greatest integer portion of that r.
That is, we take the degrees of freedom to be ⌊r⌋ = 55.

Then, the critical value approach tells us to reject the null hypothesis in favor of the alternative
hypothesis if:

t ≥ 2.004 or t ≤ −2.004 (the 0.025 cutoffs of a t-distribution with 55 degrees of freedom)


We reject the null hypothesis because the test statistic (t = 3.54) falls in the rejection region:

[Figure: t-distribution with 55 degrees of freedom; rejection region beyond ±2.004; the test statistic t = 3.54 falls in the upper rejection region.]

There is (again!) sufficient evidence at the α = 0.05 level to conclude that the average fastest speed
driven by the population of male college students differs from the average fastest speed driven by
the population of female college students.

And again, the decision is the same using the p-value approach. The p-value is 0.0008:

p-value = 2 × P(T₅₅ ≥ 3.54) = 0.0008

Therefore, because p = 0.0008 ≤ α = 0.05, we reject the null hypothesis in favor of the alternative
hypothesis. Again, we conclude that there is sufficient evidence at the α = 0.05 level to conclude
that the average fastest speed driven by the population of male college students differs from the
average fastest speed driven by the population of female college students.

At any rate, we see that in this case, our conclusion is the same regardless of whether or not we
assume equality of the population variances.

And, just in case you're interested... we'll see how to tell Minitab to conduct a Welch's t-test very
soon, but in the meantime, this is what the output would look like for this example:

Two-Sample T: For Fastest

Gender N Mean StDev SE Mean

1 34 105.5 20.1 3.4


2 29 90.9 12.2 2.3

Difference = mu (1) - mu (2)


Estimate for difference: 14.6085
95% CI for difference: (6.3575, 22.8596)
T-Test of difference = 0 (vs not =) : T-Value = 3.55 P-Value = 0.001 DF = 55
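
Welch's test uses the same summary statistics; the only change in the Python sketch shown earlier (again an assumed alternative to Minitab) is equal_var=False, which tells scipy not to pool the variances:

from scipy.stats import ttest_ind_from_stats

# Same summary statistics as before, but without assuming equal population variances
result = ttest_ind_from_stats(mean1=105.5, std1=20.1, nobs1=34,
                              mean2=90.9,  std2=12.2, nobs2=29,
                              equal_var=False)  # Welch's t-test

print(result)   # statistic about 3.54, p-value about 0.001; scipy uses r of about 55.5 without truncating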

11.3 - Using Minitab


Just as is the case for asking Minitab to calculate pooled t-intervals and Welch's t-intervals for
μ_X − μ_Y, the commands necessary for asking Minitab to perform a two-sample t-test or a Welch's t-
test depend on whether the data are entered in two columns, or the data are entered in one column
with a grouping variable in a second column.

Let's recall the spider and prey example, in which the feeding habits of two species of net-casting
spiders were studied. The two species, deinopis and menneus, coexist in eastern Australia. The
following data were obtained on the size, in millimeters, of the prey of random samples of the two
species:

Size of Random Prey Samples of the Deinopis Spider (millimeters):
12.9  10.2  7.4  7.0  10.5  11.9  7.1  9.9  14.4  11.3

Size of Random Prey Samples of the Menneus Spider (millimeters):
10.2  6.9  10.9  11.0  10.1  5.3  7.5  10.3  9.2  8.8

Let's use the data and Minitab to test whether the mean prey size of the populations of the two
types of spiders differs.

When the Data are Entered in Two Columns


1. Enter the data in two columns, such as:

2. Under the Stat menu, select Basic Statistics, and then select 2-Sample t...:


3. In the pop-up window that appears, select Samples in different columns. Specify the name of
the First variable, and specify the name of the Second variable. For the two-sample (pooled) t-
test, click on the box labeled Assume equal variances. (For Welch's t-test, leave the box
labeled Assume equal variances unchecked.):

4. Click on the button labeled Options... In the pop-up window that appears, for the box
labeled Alternative, select either less than, greater than, or not equal depending on the
direction of the alternative hypothesis:

Then, click OK to return to the main pop-up window.


5. Then, upon clicking OK on the main pop-up window, the output should appear in the
Session window:

Two-Sample T: For Deinopis vs Menneus

Variable N Mean StDev SE Mean
Deinopis 10 10.26 2.51 0.79

Menneus 10 9.02 1.90 0.60

Difference = mu (Deinopis) - mu (Menneus)


Estimate for difference: 1.240
95% CI for difference: (-0.852, 3.332)
T-Test of difference = 0 (vs not =): T-Value = 1.25 P-Value = 0.229 DF = 18
Both use Pooled StDev = 2.2266
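
Because the raw prey sizes are short lists, the same analysis is a one-liner in Python with scipy.stats.ttest_ind (a sketch offered as an assumed alternative to the Minitab session above):

from scipy.stats import ttest_ind

deinopis = [12.9, 10.2, 7.4, 7.0, 10.5, 11.9, 7.1, 9.9, 14.4, 11.3]
menneus  = [10.2, 6.9, 10.9, 11.0, 10.1, 5.3, 7.5, 10.3, 9.2, 8.8]

pooled = ttest_ind(deinopis, menneus, equal_var=True)    # pooled two-sample t-test
welch  = ttest_ind(deinopis, menneus, equal_var=False)   # Welch's t-test

print(pooled)   # statistic about 1.25, p-value about 0.23, matching the output above
print(welch)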

When the Data are Entered in One Column, and a Grouping Variable in a Second Column

1. Enter the data in one column (called Prey, say), and the grouping variable in a second column
(called Group, say, with 1 denoting a deinopis spider and 2 denoting a menneus spider), such
as:

2. Under the Stat menu, select Basic Statistics, and then select 2-Sample t...:


3. In the pop-up window that appears, select Samples in one column. Specify the name of the
Samples variable (Prey, for us) and specify the name of the Subscripts (grouping) variable
(Group, for us). For the two-sample (pooled) t-test, click on the box labeled Assume equal
variances. (For Welch's t-test, leave the box labeled Assume equal variances unchecked.):

4. Click on the button labeled Options... In the pop-up window that appears, for the box
labeled Alternative, select either less than, greater than, or not equal depending on the
direction of the alternative hypothesis:

Then, click OK to return to the main pop-up window.


5. Then, upon clicking OK on the main pop-up window, the output should appear in the
Session window:

Two-Sample T: For Prey

Group N Mean StDev SE Mean


1 10 10.26 2.51 0.79

2 10 9.02 1.90 0.60

Difference = mu (1) - mu (2)


Estimate for difference: 1.240
95% CI for difference: (-0.852, 3.332)
T-Test of difference = 0 (vs not =): T-Value = 1.25 P-Value = 0.229 DF = 18
Both use Pooled StDev = 2.2266


Source: https://online.stat.psu.edu/stat415/lesson/11


Lesson 12: Tests for Variances



Continuing our development of hypothesis tests for various population parameters, in this lesson, we'll
focus on hypothesis tests for population variances. Specifically, we'll develop:

a hypothesis test for testing whether a single population variance equals a particular value
a hypothesis test for testing whether two population variances are equal

12.1 - One Variance


Yeehah again! The theoretical work for developing a hypothesis test for a population variance σ² is
already behind us. Recall that if you have a random sample of size n from a normal population with
(unknown) mean μ and variance σ², then:

χ² = (n − 1)S² / σ²

follows a chi-square distribution with n−1 degrees of freedom. Therefore, if we're interested in testing
the null hypothesis:

H₀: σ² = σ₀²

against any of the alternative hypotheses:

Hₐ: σ² ≠ σ₀², Hₐ: σ² < σ₀², or Hₐ: σ² > σ₀²

we can use the test statistic:

χ² = (n − 1)S² / σ₀²

and follow the standard hypothesis testing procedures. Let's take a look at an example.

Example 12-1


A manufacturer of hard safety hats for construction workers is concerned about the mean and the
variation of the forces its helmets transmit to wearers when subjected to an external force. The
manufacturer has designed the helmets so that the mean force transmitted by the helmets to the
workers is 800 pounds (or less) with a standard deviation of less than 40 pounds. Tests were run on a
random sample of n = 40 helmets, and the sample mean and sample standard deviation were found to
be 825 pounds and 48.5 pounds, respectively.

Do the data provide sufficient evidence, at the α = 0.05 level, to conclude that the population standard
deviation exceeds 40 pounds?

Answer

We're interested in testing the null hypothesis:

H₀: σ² = 40² = 1600

against the alternative hypothesis:

Hₐ: σ² > 1600

Therefore, the value of the test statistic is:

χ² = (n − 1)s² / σ₀² = 39(48.5)² / 40² = 57.336
Is the test statistic too large for the null hypothesis to be true? Well, the critical value approach would
have us finding the threshold value such that the probability of rejecting the null hypothesis if it were
true, that is, of committing a Type I error, is small... 0.05, in this case. Using Minitab (or a chi-square
probability table), we see that the cutoff value is 54.572:


That is, we reject the null hypothesis in favor of the alternative hypothesis if the test statistic is
greater than 54.572. It is. That is, the test statistic falls in the rejection region:


[Figure: chi-square distribution with 39 degrees of freedom; rejection region above 54.572; the test statistic 57.336 falls in the rejection region.]

Therefore, we conclude that there is sufficient evidence, at the 0.05 level, to conclude that the
population standard deviation exceeds 40.

Of course, the P-value approach yields the same conclusion. In this case, the P-value is the probability
that we would observe a chi-square(39) random variable more extreme than 57.336:

[Figure: chi-square(39) distribution; the area to the right of 57.336 is the P-value, 0.029.]

As the drawing illustrates, the P-value is 0.029 (as determined using the chi-square probability
calculator in Minitab). Because P = 0.029 ≤ α = 0.05, we reject the null hypothesis in favor of the
alternative hypothesis.

Do the data provide sufficient evidence, at the α = 0.05 level, to conclude that the population standard
deviation differs from 40 pounds?

Answer

In this case, we're interested in testing the null hypothesis:

H₀: σ² = 40² = 1600

against the alternative hypothesis:

Hₐ: σ² ≠ 1600

The value of the test statistic remains the same. It is again:

χ² = 39(48.5)² / 40² = 57.336


Now, is the test statistic either too large or too small for the null hypothesis to be true? Well, the critical
value approach would have us dividing the significance level α = 0.05 into 2, to get 0.025, and putting
one of the halves in the left tail, and the other half in the other tail. Doing so (and using Minitab to get
the cutoff values), we get that the lower cutoff value is 23.654 and the upper cutoff value is 58.120:


That is, we reject the null hypothesis in favor of the two-sided alternative hypothesis if the test statistic
is either smaller than 23.654 or greater than 58.120. It is not. That is, the test statistic does not fall in
the rejection region:

[Figure: chi-square(39) distribution; the test statistic 57.336 falls between the cutoff values 23.654 and 58.120, outside the rejection region.]

Therefore, we fail to reject the null hypothesis. There is insufficient evidence, at the 0.05 level, to
conclude that the population standard deviation differs from 40.

Of course, the P-value approach again yields the same conclusion. In this case, we simply double the
P-value we obtained for the one-tailed test, yielding a two-sided P-value of 0.058:

P-value = 2 × 0.029 = 0.058

Because P = 0.058 > α = 0.05, we fail to reject the null hypothesis in favor of the two-sided alternative
hypothesis.
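
Both the one-sided and two-sided chi-square calculations can be reproduced with a short Python sketch (an assumed alternative to Minitab's chi-square probability calculator, not part of the original lesson):

from scipy.stats import chi2

# Hard-hat example: n = 40, sample sd = 48.5, hypothesized sd = 40
n, s, sigma0 = 40, 48.5, 40

chi_sq = (n - 1) * s**2 / sigma0**2                     # test statistic, about 57.34
p_upper = chi2.sf(chi_sq, df=n - 1)                     # one-sided P-value (sigma > 40), about 0.029
p_two = 2 * min(p_upper, chi2.cdf(chi_sq, df=n - 1))    # two-sided P-value, about 0.058

print(f"chi-square = {chi_sq:.3f}, one-sided P = {p_upper:.3f}, two-sided P = {p_two:.3f}")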


The above example illustrates an important fact, namely, that the conclusion for the one-sided test does
not always agree with the conclusion for the two-sided test. If you have reason to believe that the
parameter will differ from the null value in a particular direction, then you should conduct the one-
sided test.

12.2 - Two Variances


Let's now recall the theory necessary for developing a hypothesis test for testing the equality of two
population variances. Suppose X₁, X₂, ..., Xₙ is a random sample of size n from a normal population
with mean μ_X and variance σ²_X. And, suppose, independent of the first sample, Y₁, Y₂, ..., Yₘ is
another random sample of size m from a normal population with mean μ_Y and variance σ²_Y. Recall then, in
this situation, that:

(n − 1)S²_X / σ²_X  and  (m − 1)S²_Y / σ²_Y

have independent chi-square distributions with n−1 and m−1 degrees of freedom, respectively.
Therefore:

F = [ ((n − 1)S²_X/σ²_X) / (n − 1) ] / [ ((m − 1)S²_Y/σ²_Y) / (m − 1) ] = (S²_X/σ²_X) / (S²_Y/σ²_Y)

follows an F distribution with n−1 numerator degrees of freedom and m−1 denominator degrees of
freedom. Therefore, if we're interested in testing the null hypothesis:

H₀: σ²_X = σ²_Y (or equivalently H₀: σ²_X/σ²_Y = 1)

against any of the alternative hypotheses:

Hₐ: σ²_X ≠ σ²_Y, Hₐ: σ²_X < σ²_Y, or Hₐ: σ²_X > σ²_Y

we can use the test statistic:

F = S²_X / S²_Y

and follow the standard hypothesis testing procedures. When doing so, we might also want to recall
this important fact about the F-distribution:

F(1 − α; n−1, m−1) = 1 / F(α; m−1, n−1), where F(α; ν₁, ν₂) denotes the value cutting off an
upper-tail area of α from an F(ν₁, ν₂) distribution

so that when we use the critical value approach for a two-sided alternative:

Hₐ: σ²_X ≠ σ²_Y

we reject if the test statistic F is too large:

F ≥ F(α/2; n−1, m−1)

or if the test statistic F is too small:

F ≤ F(1 − α/2; n−1, m−1) = 1 / F(α/2; m−1, n−1)

Okay, let's take a look at an example. In the last lesson, we performed a two-sample t-test (as well as
Welch's test) to test whether the mean fastest speed driven by the population of male college students
differs from the mean fastest speed driven by the population of female college students. When we
performed the two-sample t-test, we just assumed the population variances were equal. Let's revisit
that example again to see if our assumption of equal variances is valid.

Example 12-2

A psychologist was interested in exploring whether or not male and female college students have
different driving behaviors. The particular statistical question she framed was as follows:

Is the mean fastest speed driven by male college students different than the mean fastest speed driven
by female college students?

The psychologist conducted a survey of a random n = 34 male college students and a random
m = 29 female college students. Here is a descriptive summary of the results of her survey:

              Males (X)   Females (Y)
Sample size      34           29
Sample mean     105.5         90.9
Sample sd        20.1         12.2

Is there sufficient evidence at the α = 0.05 level to conclude that the variance of the fastest speed
driven by male college students differs from the variance of the fastest speed driven by female college
students?

Answer

We're interested in testing the null hypothesis:

H₀: σ²_X = σ²_Y

against the alternative hypothesis:

Hₐ: σ²_X ≠ σ²_Y

The value of the test statistic is:

F = s²_Y / s²_X = (12.2)² / (20.1)² = 148.84 / 404.01 = 0.368

(Note that I intentionally put the variance of what we're calling the Y sample in the numerator and the
variance of what we're calling the X sample in the denominator. I did this only so that my results match
the Minitab output we'll obtain on the next page. In doing so, we just need to make sure that we keep
track of the correct numerator and denominator degrees of freedom.) Using the critical value
approach, we divide the significance level α = 0.05 into 2, to get 0.025, and put one of the halves in
the left tail, and the other half in the other tail. Doing so, we get that the lower cutoff value is 0.478 and
the upper cutoff value is 2.0441:

Because the test statistic falls in the rejection region, that is, because F = 0.368 ≤ 0.478, we reject the
null hypothesis in favor of the alternative hypothesis. There is sufficient evidence at the α = 0.05 level
to conclude that the population variances are not equal. Therefore, the assumption of equal variances
that we made when performing the two-sample t-test on these data in the previous lesson does not
appear to be valid. It would behoove us to use Welch's t-test instead.
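
The F statistic, its cutoff values, and the two-sided P-value can be checked with the following Python sketch (assumed, not part of the original lesson), keeping the female sample in the numerator as in the calculation above:

from scipy.stats import f

# Fastest-speed example: females (Y) n = 29, s = 12.2; males (X) m = 34, s = 20.1
n_f, s_f = 29, 12.2
n_m, s_m = 34, 20.1

F = s_f**2 / s_m**2                       # about 0.368
df1, df2 = n_f - 1, n_m - 1               # 28 and 33

lower, upper = f.ppf(0.025, df1, df2), f.ppf(0.975, df1, df2)   # about 0.478 and 2.0441
p_two = 2 * min(f.cdf(F, df1, df2), f.sf(F, df1, df2))          # about 0.009

print(f"F = {F:.3f}, cutoffs = ({lower:.3f}, {upper:.4f}), two-sided P = {p_two:.3f}")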

12.3 - Using Minitab


In each case, we'll illustrate how to perform the hypothesis tests of this lesson using summarized data.


Hypothesis Test for One Variance


1. Under the Stat menu, select Basic Statistics, and then select 1 Variance...:

2. In the pop-up window that appears, in the box labeled Data, select Sample standard deviation
(or alternatively Sample variance). In the box labeled Sample size, type in the size n of the
sample. In the box labeled Sample standard deviation, type in the sample standard deviation.
Click on the box labeled Perform hypothesis test, and in the box labeled Value, type in the
Hypothesized standard deviation (or alternatively the Hypothesized variance):

3. Click on the button labeled Options... In the pop-up window that appears, for the box labeled
Alternative, select either less than, greater than, or not equal depending on the direction of the
alternative hypothesis:


Then, click on OK to return to the main pop-up window.

4. Then, upon clicking OK on the main pop-up window, the output should appear in the Session
window:

95% Confidence Intervals

Method       CI for StDev   CI for Variance
Chi-Square   (39.7, 62.3)   (1578, 3878)

Tests

Method       Test Statistic   DF   P-Value
Chi-Square   57.34            39   0.059

Hypothesis Test for Two Variances


1. Under the Stat menu, select Basic Statistics, and then select 2 Variances...:


2. In the pop-up window that appears, in the box labeled Data, select Sample standard deviations
(or alternatively Sample variances). In the box labeled Sample size, type in the size n of the First
sample and m of the Second sample. In the box labeled Standard deviation, type in the sample
standard deviations for the First and Second samples:

3. Click on the button labeled Options... In the pop-up window that appears, in the box labeled
Value, type in the Hypothesized ratio of the standard deviations (or the Hypothesized ratio of
the variances). For the box labeled Alternative, select either less than, greater than, or not
equal depending on the direction of the alternative hypothesis:

Then, click on OK to return to the main pop-up window.

4. Then, upon clicking OK on the main pop-up window, the output should appear in the
Session window:

Test and CI for Two Variances

Method

Null hypothesis Sigma(1) / Sigma(2) = 1


Alternative hypothesis Sigma(1) / Sigma(2) not = 1
Significance level Alpha = 0.05

Statistics


Sample N StDev Variance

1 29 12.200 148.840
2 34 20.100 404.010

Ratio of standard deviations = 0.607


Ratio of variances = 0.368

95% Confidence Intervals

Distribution of Data   CI for StDev Ratio   CI for Variance Ratio
Normal                 (0.425, 0.877)       (0.180, 0.770)

Tests

Method            DF1   DF2   Test Statistic   P-Value
F Test (normal)   28    33    0.37             0.009


Source: https://online.stat.psu.edu/stat415/lesson/12


Lesson 9: Tests About Proportions



We'll start our exploration of hypothesis tests by focusing on population proportions. Specifically,
we'll derive the methods used for testing:

1. whether a single population proportion p equals a particular value p₀

2. whether the difference in two population proportions, p₁ − p₂, equals a particular value, say d₀,
with the most common value being 0

thereby allowing us to test whether two populations' proportions are equal. Along the way, we'll
learn two different approaches to hypothesis testing, one being the critical value approach and one
being the p-value approach.

9.1 - The Basic Idea


Every time we perform a hypothesis test, this is the basic procedure that we will follow:

1. We'll make an initial assumption about the population parameter.


2. We'll collect evidence or else use somebody else's evidence (in either case, our evidence will
come in the form of data).
3. Based on the available evidence (data), we'll decide whether to "reject" or "not reject" our
initial assumption.

Let's try to make this outlined procedure more concrete by taking a look at the following example.

Example 9-1

A four-sided (tetrahedral) die is tossed 1000 times, and 290 fours are observed. Is there evidence to
conclude that the die is biased, that is, say, that more fours than expected are observed?


Answer

As the basic hypothesis testing procedure outlines above, the first step involves stating an initial
assumption. It is:

Assume the die is unbiased. If the die is unbiased, then each side (1, 2, 3, and 4) is equally likely. So,
we'll assume that p, the probability of getting a 4 is 0.25.

In general, the initial assumption is called the null hypothesis, and is denoted H₀. (That's a zero in
the subscript for "null.") In statistical notation, we write the initial assumption as:

H₀: p = 0.25

That is, the initial assumption involves making a statement about a population proportion.

Now, the second step tells us that we need to collect evidence (data) for or against our initial
assumption. In this case, that's already been done for us. We were told that the die was tossed
n = 1000 times, and y = 290 fours were observed. Using statistical notation again, we write the
collected evidence as a sample proportion:

p̂ = 290/1000 = 0.290

Now we just need to complete the third step of making the decision about whether or not to reject
our initial assumption that the population proportion is 0.25. Recall that the Central Limit Theorem
tells us that the sample proportion:

p̂ = Y/n

is approximately normally distributed with (assumed) mean:

p = 0.25

and (assumed) standard deviation:

√(p(1 − p)/n) = √(0.25(0.75)/1000) ≈ 0.0137

That means that:

Z = (p̂ − p) / √(p(1 − p)/n)

follows, at least approximately, the standard normal distribution. So, we can "translate" our observed
sample proportion of 0.290 onto the Z scale. Here's a picture that summarizes the situation:


[Figure: normal curve centered at p = 0.25; the observed sample proportion 0.290 (Z = 2.92) falls far out in the right tail.]

So, we are assuming that the population proportion is 0.25 (in blue), but we've observed a sample
proportion 0.290 (in red) that falls way out in the right tail of the normal distribution. It certainly
doesn't appear impossible to obtain a sample proportion of 0.29. But, that's what we're left with
deciding. That is, we have to decide if a sample proportion of 0.290 is more extreme than we'd
expect if the population proportion does indeed equal 0.25.

There are two approaches to making the decision:

1. one is called the "critical value" (or "critical region" or "rejection region") approach
2. and the other is called the "p-value" approach

Until we get to the page in this lesson titled The P-Value Approach, we'll use the critical value
approach.

Example (continued)

A four-sided (tetrahedral) die is tossed 1000 times, and 290 fours are observed. Is there evidence to
conclude that the die is biased, that is, say, that more fours than expected are observed?

Answer

Okay, so now let's think about it. We probably wouldn't reject our initial assumption that the
population proportion p = 0.25 if our observed sample proportion were 0.255. And, we might still
not be inclined to reject our initial assumption that the population proportion p = 0.25 if our
observed sample proportion were 0.27. On the other hand, we would almost certainly want to reject
our initial assumption that the population proportion p = 0.25 if our observed sample proportion
were 0.35. That suggests, then, that there is some "threshold" value; once we "cross" that
threshold value, we are inclined to reject our initial assumption. That is the critical value approach in
a nutshell. That is, the critical value approach tells us to define a threshold value, called a "critical
value," so that if our "test statistic" is more extreme than the critical value, then we reject the null
hypothesis.

Let's suppose that we decide to reject the null hypothesis H₀: p = 0.25 in favor of the "alternative
hypothesis" Hₐ: p > 0.25 if:

p̂ ≥ 0.273

or equivalently if

Z ≥ 1.645

Here's a picture of such a "critical region" (or "rejection region"):

[Figure: normal curve centered at p = 0.25; the critical region of size 0.05 lies to the right of p̂ = 0.273, that is, Z = 1.645.]

Note, by the way, that the "size" of the critical region is 0.05. This will become apparent in a bit when
we talk below about the possible errors that we can make whenever we conduct a hypothesis test.

At any rate, let's get back to deciding whether our particular sample proportion appears to be too
extreme. Well, it looks like we should reject the null hypothesis (our initial assumption p = 0.25)
because:

p̂ = 0.290 ≥ 0.273

or equivalently since our test statistic:

Z = (0.290 − 0.25) / √(0.25(0.75)/1000) = 2.92

is greater than 1.645.

Our conclusion: we say there is sufficient evidence to conclude Hₐ: p > 0.25, that is, that the die is
biased.

By the way, this example involves what is called a one-tailed test, or more specifically, a right-tailed
test, because the critical region falls in only one of the two tails of the normal distribution, namely
the right tail.
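
The arithmetic behind the critical value approach for this example is compact enough to show in a short Python sketch (an assumption offered for illustration; it is not part of the original lesson):

from math import sqrt
from scipy.stats import norm

# Tetrahedral die example: 290 fours in 1000 tosses, H0: p = 0.25 vs Ha: p > 0.25
n, y, p0 = 1000, 290, 0.25
p_hat = y / n

z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # test statistic, about 2.92
critical = norm.ppf(0.95)                    # right-tailed critical value, 1.645

print(f"z = {z:.2f}, critical value = {critical:.3f}, reject H0? {z >= critical}")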


Before we continue on the next page by looking at two more examples, let's revisit the basic
hypothesis testing procedure that we outlined above. This time, though, let's state the procedure in
terms of performing a hypothesis test for a population proportion using the critical value
approach. The basic procedure is:

1. State the null hypothesis H₀ and the alternative hypothesis Hₐ. (By the way, some
textbooks, including ours, use the notation H₁ instead of Hₐ to denote the alternative
hypothesis.)

2. Calculate the test statistic:

Z = (p̂ − p₀) / √(p₀(1 − p₀)/n)

3. Determine the critical region.

4. Make a decision. Determine if the test statistic falls in the critical region. If it does, reject the
null hypothesis. If it does not, do not reject the null hypothesis.

Now, back to those possible errors we can make when conducting such a hypothesis test.

Possible Errors
So, argh! Every time we conduct a hypothesis test, we have a chance of making an error. (Oh dear,
why couldn't I have chosen a different profession?!)

1. If we reject the null hypothesis (in favor of the alternative hypothesis Hₐ) when the null
hypothesis is in fact true, we say we've committed a Type I error. For our example above, we
set P(Type I error) equal to 0.05:

[Figure: normal curve centered at p = 0.25 with the Type I error probability 0.05 shaded to the right of 0.273 (Z = 1.645).]

Aha! That's why the 0.05! We wanted to minimize our chance of making a Type I error! In
general, we denote α = P(Type I error) and call α the "significance level of the test." Obviously,
we want to minimize α. Therefore, typical α values are 0.01, 0.05, and 0.10.

2. If we fail to reject the null hypothesis when the null hypothesis is false, we say we've
committed a Type II error. For our example, suppose (unknown to us) that the population
proportion is actually 0.27. Then, the probability of a Type II error, in this case, is:

β = P(p̂ < 0.273 when p = 0.27)

In general, we denote β = P(Type II error). Just as we want to minimize
α = P(Type I error), we want to minimize β. Typical β values are 0.05,
0.10, and 0.20.

9.2 - More Examples


Let's take a look at two more examples of a hypothesis test for a single proportion while recalling
the hypothesis testing procedure we outlined on the previous page:

1. State the null hypothesis H₀ and the alternative hypothesis Hₐ.

2. Calculate the test statistic:

Z = (p̂ − p₀) / √(p₀(1 − p₀)/n)

3. Determine the critical region.

4. Make a decision. Determine if the test statistic falls in the critical region. If it does, reject the
null hypothesis. If it does not, do not reject the null hypothesis.

The first example involves a hypothesis test for the proportion in which the alternative hypothesis is
a "greater than hypothesis," that is, the alternative hypothesis is of the form Hₐ: p > p₀. And, the
second example involves a hypothesis test for the proportion in which the alternative hypothesis is a
"less than hypothesis," that is, the alternative hypothesis is of the form Hₐ: p < p₀.


Example 9-2

Let p equal the proportion of drivers who use a seat belt in a state that does not have a mandatory
seat belt law. It was claimed that p equaled a specified value p₀. An advertising campaign was conducted
to increase this proportion. Two months after the campaign, y drivers out of a random sample of n drivers
were wearing seat belts. Was the campaign successful?

Answer

The observed sample proportion is:

p̂ = y/n

Because it is claimed that p = p₀, the null hypothesis is:

H₀: p = p₀

Because we're interested in seeing if the advertising campaign was successful, that is, that a greater
proportion of people wear seat belts, the alternative hypothesis is:

Hₐ: p > p₀

The test statistic is therefore:

Z = (p̂ − p₀) / √(p₀(1 − p₀)/n)

If we use a significance level of α = 0.01, then the critical region is:

https://online.stat.psu.edu/stat415/book/export/html/826 7/25
7/22/23, 1:38 PM Lesson 9: Tests About Proportions

[Figure: standard normal curve with the critical region of size α = 0.01 to the right of Z = 2.326.]

That is, we reject the null hypothesis if the test statistic Z ≥ 2.326. Because the test statistic falls in
the critical region, that is, because the computed Z ≥ 2.326, we can reject the null hypothesis in favor of
the alternative hypothesis. There is sufficient evidence at the α = 0.01 level to conclude the
campaign was successful (p > p₀).

Again, note that this is an example of a right-tailed hypothesis test because the action falls in the
right tail of the normal distribution.

Example 9-3

A Gallup poll released on October 13, 2000, found that 47% of the 1052 U.S. adults surveyed
classified themselves as "very happy" when given the choices of:

"very happy"
"fairly happy"
"not too happy"

Suppose that a journalist who is a pessimist took advantage of this poll to write a headline titled
"Poll finds that U.S. adults who are very happy are in the minority." Is the pessimistic journalist's
headline warranted?

Answer

The sample proportion is:

p̂ = 0.47

Because we're interested in the majority/minority boundary line, the null hypothesis is:

H₀: p = 0.50

Because the journalist claims that the proportion of very happy U.S. adults is a minority, that is, less
than 0.50, the alternative hypothesis is:

Hₐ: p < 0.50

The test statistic is therefore:

Z = (0.47 − 0.50) / √(0.50(0.50)/1052) ≈ −1.95

Now, this time, we need to put our critical region in the left tail of the normal distribution. If we use a
significance level of α = 0.05, then the critical region is:

[Figure: standard normal curve with the critical region of size α = 0.05 to the left of Z = −1.645.]

That is, we reject the null hypothesis if the test statistic Z ≤ −1.645. Because the test statistic falls in
the critical region, that is, because Z = −1.95 ≤ −1.645, we can reject the null hypothesis in favor
of the alternative hypothesis. There is sufficient evidence at the α = 0.05 level to conclude that
p < 0.50, that is, U.S. adults who are very happy are in the minority. The journalist's pessimism
appears to be indeed warranted.

Note that this is an example of a left-tailed hypothesis test because the action falls in the left tail of
the normal distribution.

9.3 - The P-Value Approach



Example 9-4

Up until now, we have used the critical region approach in conducting our hypothesis tests. Now,
let's take a look at an example in which we use what is called the P-value approach.

Among patients with lung cancer, usually, 90% or more die within three years. As a result of new
forms of treatment, it is felt that this rate has been reduced. In a recent study of n = 150 lung cancer
patients, y = 128 died within three years. Is there sufficient evidence at the α = 0.05 level, say, to
conclude that the death rate due to lung cancer has been reduced?

Answer

The sample proportion is:

p̂ = 128/150 ≈ 0.853

The null and alternative hypotheses are:

H₀: p = 0.90 and Hₐ: p < 0.90

The test statistic is, therefore:

Z = (0.853 − 0.90) / √(0.90(0.10)/150) ≈ −1.92

And, the rejection region is:


[Figure: standard normal curve (centered at p = 0.90, i.e., Z = 0); the rejection region of size α = 0.05 lies to the left of Z = −1.645.]

Since the test statistic Z = −1.92 < −1.645, we reject the null hypothesis. There is sufficient evidence
at the α = 0.05 level to conclude that the rate has been reduced.

Example 9-4 (continued)

What if we set the significance level α = P(Type I Error) to 0.01? Is there still sufficient evidence to
conclude that the death rate due to lung cancer has been reduced?

Answer

In this case, with α = 0.01, the rejection region is Z ≤ −2.33. That is, we reject if the test statistic falls
in the rejection region defined by Z ≤ −2.33:


[Figure: standard normal curve; the rejection region of size α = 0.01 lies to the left of Z = −2.33.]

Because the test statistic Z = −1.92 > −2.33, we do not reject the null hypothesis. There is insufficient
evidence at the α = 0.01 level to conclude that the rate has been reduced.

Example 9-4 (continued)

In the first part of this example, we rejected the null hypothesis when α = 0.05. And, in the second
part of this example, we failed to reject the null hypothesis when α = 0.01. There must be some
level of α, then, at which we cross the threshold from rejecting to not rejecting the null hypothesis.
What is the smallest α that would still cause us to reject the null hypothesis?

Answer

We would, of course, reject any time the critical value was not as extreme as our test statistic −1.92, that is, any time the critical value was −1.92 or larger:


[Figure: standard normal curve marking the critical values −2.33 and −1.645 along with the observed test statistic −1.92]

That is, we would reject if the critical value were −1.645, −1.83, and −1.92. But, we wouldn't reject if
the critical value were −1.93. The α associated with the test statistic −1.92 is called the P-value. It is
the smallest α that would lead to rejection. In this case, the P-value is:

P(Z < −1.92) = 0.0274
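Because the P-value is just an area under the standard normal curve, it is easy to verify with software. The following Python sketch (scipy assumed; Minitab performs the equivalent computation) reproduces, up to rounding, the test statistic and the left-tailed P-value for the lung cancer example, along with the two-tailed P-value used a bit further below.

import math
from scipy.stats import norm

# Lung cancer example: y = 128 deaths out of n = 150 patients, H0: p = 0.90
y, n, p0 = 128, 150, 0.90

p_hat = y / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)  # about -1.91

p_left = norm.cdf(z)       # left-tailed P-value, about 0.028
p_two = 2 * norm.cdf(z)    # two-tailed P-value, about 0.057

print(f"Z = {z:.2f}, left-tailed P = {p_left:.4f}, two-tailed P = {p_two:.4f}")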

So far, all of the examples we've considered have involved a one-tailed hypothesis test in which the
alternative hypothesis involved either a less than (<) or a greater than (>) sign. What happens if we
weren't sure of the direction in which the proportion could deviate from the hypothesized null
value? That is, what if the alternative hypothesis involved a not-equal sign (≠)? Let's take a look at an
example.

Example 9-4 (continued)

What if we wanted to perform a "two-tailed" test? That is, what if we wanted to test:

H0: p = 0.90 versus HA: p ≠ 0.90

at the α = 0.05 level?


Answer

Let's first consider the critical value approach. If we allow for the possibility that the sample
proportion could either prove to be too large or too small, then we need to specify a threshold
value, that is, a critical value, in each tail of the distribution. In this case, we divide the "significance
level" α = 0.05 by 2 to get 0.025 in each tail:

[Figure: standard normal curve with the two-tailed rejection region, Z ≤ −1.96 or Z ≥ 1.96, shaded]

That is, our rejection rule is that we should reject the null hypothesis if Z ≤ −1.96 or we should reject
the null hypothesis if Z ≥ 1.96. Alternatively, we can write that we should reject the null hypothesis if
|Z| ≥ 1.96. Because our test statistic is −1.92, we just barely fail to reject the null hypothesis, because
|−1.92| = 1.92 < 1.96. In this case, we would say that there is insufficient evidence at the α = 0.05
level to conclude that the sample proportion differs significantly from 0.90.

Now for the P-value approach. Again, needing to allow for the possibility that the sample
proportion is either too large or too small, we multiply the P-value we obtain for the one-tailed test
by 2:

[Figure: standard normal curve with area 0.0274 shaded beyond −1.92 in the left tail and beyond 1.92 in the right tail]

That is, the P-value is:

P = 2 × P(Z < −1.92) = 2(0.0274) = 0.0548 ≈ 0.055

Because the P-value 0.055 is (just barely) greater than the significance level α = 0.05, we barely fail
to reject the null hypothesis. Again, we would say that there is insufficient evidence at the α = 0.05
level to conclude that the sample proportion differs significantly from 0.90.

Let's close this example by formalizing the definition of a P-value, as well as summarizing the P-
value approach to conducting a hypothesis test.

P-Value

The P-value is the smallest significance level that leads us to reject the null hypothesis.

Alternatively (and the way I prefer to think of P-values), the P-value is the probability that we'd
observe a more extreme statistic than we did if the null hypothesis were true.

If the P-value is small, that is, if the P-value ≤ α, then we reject the null hypothesis H0.

Note!

By the way, to test H0: p = p0, some statisticians will use the test statistic:

Z = (p̂ − p0) / √( p̂(1 − p̂)/n )

rather than the one we've been using:

Z = (p̂ − p0) / √( p0(1 − p0)/n )

One advantage of doing so is that the interpretation of the confidence interval — does it contain p0?
— is always consistent with the hypothesis test decision, as illustrated here:

Answer

For the sake of ease, let:


Two-tailed test. In this case, the critical region approach tells us to reject the null hypothesis
against the alternative hypothesis :

if or if

which is equivalent to rejecting the null hypothesis:

if or if

which is equivalent to rejecting the null hypothesis:

if or if

That's the same as saying that we should reject the null hypothesis if p0 is not in the
confidence interval!

Left-tailed test. In this case, the critical region approach tells us to reject the null hypothesis
against the alternative hypothesis :

if

which is equivalent to rejecting the null hypothesis:

if

which is equivalent to rejecting the null hypothesis:

if

That's the same as saying that we should reject the null hypothesis if p0 is not in the upper
confidence interval:

9.4 - Comparing Two Proportions

9.4 - Comparing Two Proportions

So far, all of our examples involved testing whether a single population proportion p equals some
value p0. Now, let's turn our attention for a bit towards testing whether one population proportion p1
equals a second population proportion p2. Additionally, most of our examples thus far have involved
left-tailed tests in which the alternative hypothesis involved HA: p < p0 or right-tailed tests in which
the alternative hypothesis involved HA: p > p0. Here, let's consider an example that tests the equality
of two proportions against the alternative that they are not equal. Using statistical notation, we'll test:

H0: p1 = p2 versus HA: p1 ≠ p2


Example 9-5

Time magazine reported the result of a telephone poll of 800 adult Americans. The question posed
of the Americans who were surveyed was: "Should the federal tax on cigarettes be raised to pay for
health care reform?" The results of the survey were:

        Non-Smokers   Smokers
Yes         351          41
No          254         154
Total       605         195

Is there sufficient evidence at the α = 0.05 level, say, to conclude that the two populations — smokers
and non-smokers — differ significantly with respect to their opinions?

Answer

If p1 = the proportion of the non-smoker population who reply "yes" and p2 = the proportion of the
smoker population who reply "yes," then we are interested in testing the null hypothesis:

H0: p1 = p2

against the alternative hypothesis:

HA: p1 ≠ p2
Before we can actually conduct the hypothesis test, we'll have to derive the appropriate test statistic.

Theorem

The test statistic for testing the difference in two population proportions, that is, for testing the null
hypothesis H0: p1 − p2 = 0, is:

Z = (p̂1 − p̂2) / √( p̂(1 − p̂)(1/n1 + 1/n2) )

where:

p̂ = (Y1 + Y2) / (n1 + n2)

the proportion of "successes" in the two samples combined.

Proof

Recall that:

p̂1 − p̂2

is approximately normally distributed with mean:

p1 − p2

and variance:

p1(1 − p1)/n1 + p2(1 − p2)/n2

But, if we assume that the null hypothesis is true, then the population proportions equal some
common value p, say, that is, p1 = p2 = p. In that case, the variance becomes:

p(1 − p)(1/n1 + 1/n2)

So, under the assumption that the null hypothesis is true, we have that:

Z = (p̂1 − p̂2) / √( p(1 − p)(1/n1 + 1/n2) )

follows (at least approximately) the standard normal N(0,1) distribution. Since we don't know the
(assumed) common population proportion p any more than we know the proportions p1 and p2 of
each population, we can estimate p using:

p̂ = (Y1 + Y2) / (n1 + n2)

the proportion of "successes" in the two samples combined. And, hence, our test statistic becomes:

Z = (p̂1 − p̂2) / √( p̂(1 − p̂)(1/n1 + 1/n2) )


as was to be proved.

Example 9-5 (continued)

Time magazine reported the result of a telephone poll of 800 adult Americans. The question posed
of the Americans who were surveyed was: "Should the federal tax on cigarettes be raised to pay for
health care reform?" The results of the survey were:

        Non-Smokers   Smokers
Yes         351          41
No          254         154
Total       605         195

Is there sufficient evidence at the α = 0.05 level, say, to conclude that the two populations — smokers
and non-smokers — differ significantly with respect to their opinions?

Answer

The overall sample proportion is:

p̂ = (351 + 41) / (605 + 195) = 392/800 = 0.49

That implies then that the test statistic for testing:

H0: p1 = p2 versus HA: p1 ≠ p2

is:

Z = (0.5802 − 0.2103) / √( 0.49(0.51)(1/605 + 1/195) ) ≈ 8.99


Errr.... that Z-value is off the charts, so to speak. Let's go through the formalities anyway making the
decision first using the rejection region approach, and then using the P-value approach. Putting half
of the rejection region in each tail, we have:

[Figure: standard normal curve with the two-tailed rejection region, Z ≤ −1.96 or Z ≥ 1.96, shaded]

That is, we reject the null hypothesis if Z ≤ −1.96 or if Z ≥ 1.96. We clearly reject H0, since


8.99 falls in the "red zone," that is, 8.99 is (much) greater than 1.96. There is sufficient evidence at the
0.05 level to conclude that the two populations differ with respect to their opinions concerning
imposing a federal tax to help pay for health care reform.

Now for the P-value approach:

[Figure: standard normal curve with the areas beyond −8.99 and 8.99 shaded; the tail areas are negligible]

That is, the P-value is less than 0.0001. Because P < α = 0.05, we reject the null
hypothesis. Again, there is sufficient evidence at the 0.05 level to conclude that the two populations
differ with respect to their opinions concerning imposing a federal tax to help pay for health care
reform.

Thankfully, as should always be the case, the two approaches (the critical value approach and the
P-value approach) lead to the same conclusion.
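For readers who like to check the arithmetic with a few lines of code, here is a minimal Python sketch (scipy assumed) of the pooled two-proportion Z test; the counts are the survey counts that appear in the Minitab output below (351 of 605 non-smokers and 41 of 195 smokers answering "yes").

import math
from scipy.stats import norm

y1, n1 = 351, 605   # non-smokers answering "yes"
y2, n2 = 41, 195    # smokers answering "yes"

p1_hat, p2_hat = y1 / n1, y2 / n2
p_pooled = (y1 + y2) / (n1 + n2)   # pooled estimate of the common proportion

se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / se         # about 8.99

p_value = 2 * norm.sf(abs(z))      # two-tailed P-value, essentially zero

print(f"Z = {z:.2f}, P-value = {p_value:.2e}")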


Note!

For testing H0: p1 − p2 = 0, some statisticians use the test statistic:

Z = (p̂1 − p̂2) / √( p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2 )

instead of the one we used:

Z = (p̂1 − p̂2) / √( p̂(1 − p̂)(1/n1 + 1/n2) )

An advantage of doing so is again that the interpretation of the confidence interval — does it
contain 0? — is always consistent with the hypothesis test decision.

9.5 - Using Minitab

9.5 - Using Minitab

Hypothesis Test for a Single Proportion


To illustrate how to tell Minitab to perform a Z-test for a single proportion, let's refer to the lung
cancer example that appeared on the page called The P-Value Approach.

1. Under the Stat menu, select Basic Statistics, and then 1 Proportion...:


2. In the pop-up window that appears, click on the radio button labeled Summarized data. In
the box labeled Number of events, type in the number of successes or events of interest, and
in the box labeled Number of trials, type in the sample size n. Click on the box labeled
Perform hypothesis test, and in the box labeled Hypothesized proportion, type in the value
of the proportion assumed in the null hypothesis:

3. Click on the button labeled Options... In the pop-up window that appears, for the box
labeled Alternative, select either less than, greater than, or not equal depending on the
direction of the alternative hypothesis. Click on the box labeled Use test and interval based
on normal distribution:


Then, click OK to return to the main pop-up window.

4. Then, upon clicking OK on the main pop-up window, the output should appear in the
Session window:

Test of P = 0.9 vs p < 0.9


95% Upper Z- P-
Sample X N Sample P Bound Value Value
1 128 150 0.853333 0.900846 -1.91 0.028

Using the normal approximation.

As you can see, Minitab reports not only the value of the test statistic (Z = −1.91) but also the
P-value (0.028) and the 95% confidence interval (one-sided in this case, because of the one-
sided hypothesis).

Hypothesis Test for Comparing Two Proportions


To illustrate how to tell Minitab to perform a Z-test for comparing two population proportions, let's
refer to the smoker survey example that appeared on the page called Comparing Two Proportions.

1. Under the Stat menu, select Basic Statistics, and then 2 Proportions...:


2. In the pop-up window that appears, click on the radio button labeled Summarized data. In
the boxes labeled Events, type in the number of successes or events of interest for both the
First and Second samples. And in the boxes labeled Trials, type in the size of the First
sample and the size of the Second sample:

3. Click on the button labeled Options... In the pop-up window that appears, in the box
labeled Test difference, type in the assumed value of the difference in the proportions that
appears in the null hypothesis. The default value is 0.0, the value most commonly assumed, as
it means that we are interested in testing for the equality of the population proportions. For
the box labeled Alternative, select either less than, greater than, or not equal depending on
the direction of the alternative hypothesis. Click on the box labeled Use pooled estimate of p
for test:

Then, click OK to return to the main pop-up window.

4. Then, upon clicking OK on the main pop-up window, the output should appear in the
Session window:

Sample X N Sample P
1 351 605 0.580165
2 41 195 0.210256

Difference = p (1) - p (2)


Estimate for difference: 0.369909
95% CI for difference: (0.300499, 0.439319)
T-Test of difference = 0 (vs not =0): Z = 8.99 P-Value = 0.000

Fisher's exact test: P-Value = 0.000

Again, as you can see, Minitab reports not only the value of the test statistic (Z = 8.99) but
other useful things as well, including the P-value, which in this case is so small as to be deemed
to be 0.000 to three digits. For scientific reporting purposes, we would typically write that as P
< 0.0001.


Lesson 15: Tests Concerning Regression and Correlation


Lesson 15: Tests Concerning Regression and Correlation

Overview
In lessons 35 and 36, we learned how to calculate point and interval estimates of the intercept and
slope parameters, α and β, of a simple linear regression model:

with the random errors following a normal distribution with mean 0 and variance σ². In this lesson,
we'll learn how to conduct a hypothesis test for testing the null hypothesis that the slope parameter β
equals some value, β0, say. Specifically, we'll learn how to test the null hypothesis H0: β = β0 using a
t-statistic.

Now, perhaps it is not a point that has been emphasized yet, but if you take a look at the form of the
simple linear regression model, you'll notice that the response Y's are denoted using a capital letter,
while the predictor x's are denoted using a lowercase letter. That's because, in the simple linear
regression setting, we view the predictors as fixed values, whereas we view the responses as random
variables whose possible values depend on the population from which they came. Suppose instead
that we had a situation in which we thought of the pair (X, Y) as being a random sample,
(Xi, Yi), i = 1, ..., n, from a bivariate normal distribution with parameters μX, μY, σ²X, σ²Y, and ρ. Then,
we might be interested in testing the null hypothesis H0: ρ = 0, because we know that if the
correlation coefficient is 0, then X and Y are independent random variables. For this reason, we'll
learn, not one, but three (!) possible hypothesis tests for testing the null hypothesis that the
correlation coefficient is 0. Then, because we haven't yet derived an interval estimate for the
correlation coefficient, we'll also take the time to derive an approximate confidence interval for ρ.

15.1 - A Test for the Slope

15.1 - A Test for the Slope

Once again we've already done the bulk of the theoretical work in developing a hypothesis test for
the slope parameter β of a simple linear regression model when we developed a confidence interval
for β. We had shown then that:

(β̂ − β) / √( MSE / Σ(xi − x̄)² )

follows a t distribution with n − 2 degrees of freedom. Therefore, if we're interested in testing the null
hypothesis:

H0: β = β0

against any of the alternative hypotheses:

HA: β ≠ β0, HA: β < β0, HA: β > β0

we can use the test statistic:

t = (β̂ − β0) / √( MSE / Σ(xi − x̄)² )

and follow the standard hypothesis testing procedures. Let's take a look at an example.

Example 15-1

In alligators' natural habitat, it is typically easier to observe the length of an alligator than it is the
weight. This data set contains the log weight (Y) and log length (x) for 15 alligators captured in
central Florida. A scatter plot of the data suggests that there is a linear relationship between the
response Y and the predictor x. Therefore, a wildlife researcher is interested in fitting the linear
model: [1] [2]

to the data. She is particularly interested in testing whether there is a relationship between the
length and weight of alligators. At the α = 0.05 level, perform a test of the null hypothesis H0: β = 0
against the alternative hypothesis HA: β ≠ 0.

Answer

The easiest way to perform the hypothesis test is to let Minitab do the work for us! Under the Stat
menu, selecting Regression, and then Regression, and specifying the response logW (for log weight)
and the predictor logL (for log length), we get:

The regression equation is


logW = - 8.48 + 3.43 logL

Predictor Coef SE Coef T P


Constant -8.4761 0.5007 -16.93 0.000
logL 3.4311 0.1330 25.80 0.000

Analysis of Variance


Source DF SS MS F P
Regression 1 10.064 10.064 665.81 0.000

Residual Error 13 0.196 0.015

Total 14 10.260

Easy as pie! Minitab tells us that the test statistic is t = 25.80 with a P-value (0.000) that is less than
0.001. Because the P-value is less than 0.05, we reject the null hypothesis at the 0.05 level.
There is sufficient evidence to conclude that the slope parameter does not equal 0. That is, there is
sufficient evidence, at the 0.05 level, to conclude that there is a linear relationship, among the
population of alligators, between the log length and log weight.

Of course, since we are learning this material for just the first time, perhaps we could go through the
calculation of the test statistic at least once. Letting Minitab do some of the dirtier calculations for us,
such as calculating Σ(xi − x̄)² and MSE, as well as determining that the slope estimate is β̂ = 3.4311
with standard error 0.1330, we get:

t = (3.4311 − 0) / 0.1330 ≈ 25.8

which is the test statistic that Minitab calculated... well, with just a bit of round-off error.
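If you would like to reproduce this kind of calculation outside of Minitab, a minimal Python sketch is given below. The (x, y) values are hypothetical stand-ins, since the alligator data set itself is not listed on this page, but the slope estimate, its standard error, and the t-statistic are computed exactly as described above.

import numpy as np
from scipy import stats

# Hypothetical (x, y) pairs standing in for the log length / log weight measurements
x = np.array([3.87, 3.61, 4.33, 3.43, 3.81, 3.83, 3.46, 3.76, 3.50, 3.58])
y = np.array([4.87, 3.93, 6.46, 3.33, 4.38, 4.70, 3.50, 4.50, 3.58, 3.64])

n = len(x)
sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx   # slope estimate
b0 = y.mean() - b1 * x.mean()                        # intercept estimate

mse = np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2)     # mean squared error
se_b1 = np.sqrt(mse / sxx)                           # standard error of the slope

t = (b1 - 0) / se_b1                                 # test of H0: beta = 0
p_value = 2 * stats.t.sf(abs(t), df=n - 2)

print(f"b1 = {b1:.3f}, SE = {se_b1:.3f}, t = {t:.2f}, P = {p_value:.4g}")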

15.2 - Three Tests for Rho

15.2 - Three Tests for Rho

The hypothesis test for the slope β that we developed on the previous page was developed under
the assumption that a response Y is a linear function of a nonrandom predictor x. This situation
occurs when the researcher has complete control of the values of the variable x. For example, a
researcher might be interested in modeling the linear relationship between the temperature x of an
oven and the moistness Y of chocolate chip muffins. In this case, the researcher sets the oven
temperatures (in degrees Fahrenheit) to 350, 360, 370, and so on, and then observes the values of
the random variable Y, that is, the moistness of the baked muffins. In this case, the linear model:


implies that the average moistness:

is a linear function of the temperature setting.

There are other situations, however, in which the variable x is not nonrandom (yes, that's a double
negative!), but rather an observed value of a random variable X. For example, a fisheries researcher
may want to relate the age Y of a sardine to its length X. If a linear relationship could be
established, then in the future fisheries researchers could predict the age of a sardine simply by
measuring its length. In this case, the linear model:

implies that the average age of a sardine, given its length is x:

is a linear function of the length. That is, the conditional mean of Y given X = x is a linear function.
Now, in this second situation, in which both X and Y are deemed random, we typically assume that
the pairs (Xi, Yi), i = 1, 2, ..., n, are a random sample from a bivariate normal
distribution with means μX and μY, variances σ²X and σ²Y, and correlation coefficient ρ. If that's the
case, it can be shown that the conditional mean:

must be of the form:

That is:

Now, for the case where (X, Y) has a bivariate distribution, the researcher may not necessarily be
interested in estimating the linear function:

but rather simply knowing whether X and Y are independent. In STAT 414, we've learned that if (X, Y)
follows a bivariate normal distribution, then testing for the independence of X and Y is equivalent to
testing whether the correlation coefficient ρ equals 0. We'll now work on developing three different
hypothesis tests for testing H0: ρ = 0 assuming (X, Y) follows a bivariate normal distribution.


A T-Test for Rho


Given our wordy prelude above, this test may be the simplest of all of the tests to develop. That's
because we argued above that if (X, Y) follows a bivariate normal distribution, and the conditional
mean E(Y | x) is a linear function:

then:

β = ρ σY/σX, so that ρ = 0 if and only if β = 0

That suggests, therefore, that testing H0: ρ = 0 against any of the alternative hypotheses HA: ρ ≠ 0,
HA: ρ < 0, and HA: ρ > 0 is equivalent to testing H0: β = 0 against the corresponding alternative
hypothesis HA: β ≠ 0, HA: β < 0, and HA: β > 0. That is, we can simply compare the test statistic:

t = β̂ / √( MSE / Σ(xi − x̄)² )

to a t distribution with n − 2 degrees of freedom. It should be noted, though, that the test statistic
can be instead written as a function of the sample correlation coefficient R. That is, the test statistic
can be alternatively written as:

t = R √(n − 2) / √(1 − R²)

and because of its algebraic equivalence to the first test statistic, it too follows a t distribution with
n − 2 degrees of freedom. Huh? How are the two test statistics algebraically equivalent? Well, if the
following two statements are true:

1.

2.

then simple algebra illustrates that the two test statistics are indeed algebraically equivalent:


Now, for the veracity of those two statements? Well, they are indeed true. The first one requires just
some simple algebra. The second one requires a bit of trickier algebra that you'll soon be asked to
work through for homework.

An R-Test for Rho


It would be nice to use the sample correlation coefficient R as a test statistic to test more general
hypotheses about the population correlation coefficient:

H0: ρ = ρ0

but the probability distribution of R is difficult to obtain. It turns out though that we can derive a
hypothesis test using just R provided that we are interested in testing the more specific null
hypothesis that X and Y are independent, that is, for testing H0: ρ = 0.

Theorem

Provided that ρ = 0, the probability density function of the sample correlation coefficient R is:

g(r) = [ Γ((n − 1)/2) / ( Γ(1/2) Γ((n − 2)/2) ) ] (1 − r²)^((n − 4)/2)

over the support −1 ≤ r ≤ 1.

Proof

We'll use the distribution function technique, in which we first find the cumulative distribution
function , and then differentiate it to get the desired probability density function . The
cumulative distribution function is:

The first equality is just the definition of the cumulative distribution function, while the second and
third equalities come from the definition of the statistic as a function of the sample correlation
coefficient . Now, using what we know of the p.d.f. of a random variable with
degrees of freedom, we get:

Now, it's just a matter of taking the derivative of the c.d.f. to get the p.d.f. ). Using the
Fundamental Theorem of Calculus, in conjunction with the chain rule, we get:


Focusing first on the derivative part of that equation, using the quotient rule, we get:

Simplifying, we get:

Now, if we multiply by 1 in a special way, that is, this way:

and then simplify, we get:

Now, looking back at , let's work on the part. Replacing the function in the one place where
a t appears in the p.d.f. of a random variable with degrees of freedom, we get:

Canceling a few things out we get:

Now, because:

we finally get:

We're almost there! We just need to multiply the two parts together. Doing so, we get:


which simplifies to:

over the support , as was to be proved.

Now that we know the p.d.f. of R, testing H0: ρ = 0 against any of the possible alternative
hypotheses just involves integrating g(r) to find the critical value(s) to ensure that α, the probability
of a Type I error, is small. For example, to test H0: ρ = 0 against the alternative HA: ρ > 0, we find
the value rα such that:

P(R ≥ rα) = α

Yikes! Do you have any interest in integrating that function? Well, me neither! That's why we'll
instead use an R Table, such as the one we have in Table IX at the back of our textbook.

An Approximate Z-Test for Rho


Okay, the derivation for this hypothesis test is going to be MUCH easier than the derivation for that
last one. That's because we aren't going to derive it at all! We are going to simply state, without
proof, the following theorem.

Theorem

The statistic:

(1/2) ln( (1 + R) / (1 − R) )

follows an approximate normal distribution with mean (1/2) ln( (1 + ρ) / (1 − ρ) ) and variance
1/(n − 3).

The theorem, therefore, allows us to test the general null hypothesis H0: ρ = ρ0 against any of the
possible alternative hypotheses comparing the test statistic:

Z = [ (1/2) ln( (1 + R)/(1 − R) ) − (1/2) ln( (1 + ρ0)/(1 − ρ0) ) ] √(n − 3)

to a standard normal distribution.

What? We've looked at no examples yet on this page? Let's take care of that by closing with an
example that utilizes each of the three hypothesis tests we derived above.


Example 15-2

An admissions counselor at a large public university was interested in learning whether freshmen
calculus grades are independent of high school math achievement test scores. The sample
correlation coefficient between the mathematics achievement test scores and calculus grades for a
random sample of n = 10 college freshmen was deemed to be r = 0.84.

Does this observed sample correlation coefficient suggest, at the α = 0.05 level, that the population
of freshmen calculus grades are independent of the population of high school math achievement
test scores?

Answer

The admissions counselor is interested in testing:

H0: ρ = 0 against HA: ρ ≠ 0

Using the t-statistic we derived, we get:

t = r √(n − 2) / √(1 − r²) = 0.84 √8 / √(1 − 0.84²) ≈ 4.38

We reject the null hypothesis if the test statistic is greater than 2.306 or less than −2.306.

[Figure: t distribution with 8 degrees of freedom, rejection region beyond ±2.306 shaded]


Because t ≈ 4.38 > 2.306, we reject the null hypothesis in favor of the alternative hypothesis. There
is sufficient evidence at the 0.05 level to conclude that the population of freshmen calculus grades
are not independent of the population of high school math achievement test scores.
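The t-test for ρ is easy to reproduce with a few lines of code; this minimal Python sketch (scipy assumed) uses r = 0.84 with n = 10, the sample size consistent with the 8 degrees of freedom used in this example.

import math
from scipy import stats

r, n = 0.84, 10

t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)  # about 4.38
t_crit = stats.t.ppf(0.975, df=n - 2)             # about 2.306
p_value = 2 * stats.t.sf(abs(t), df=n - 2)

print(f"t = {t:.2f}, critical value = {t_crit:.3f}, P = {p_value:.4f}, reject H0: {abs(t) > t_crit}")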

Using the R-statistic, with 8 degrees of freedom, Table IX in the back of the book tells us to reject
the null hypothesis if the absolute value of R is greater than 0.6319. Because our observed
r = 0.84 > 0.6319, we again reject the null hypothesis in favor of the alternative hypothesis. There
is sufficient evidence at the 0.05 level to conclude that freshmen calculus grades are not
independent of high school math achievement test scores.

Using the approximate Z-statistic, we get:

Z = (1/2) ln( 1.84 / 0.16 ) √(10 − 3) ≈ 3.23

In this case, we reject the null hypothesis if the absolute value of Z were greater than 1.96. It clearly
is, and so we again reject the null hypothesis in favor of the alternative hypothesis. There is sufficient
evidence at the 0.05 level to conclude that freshmen calculus grades are not independent of high
school math achievement test scores.
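The approximate Z-test lends itself to the same kind of check. The sketch below (Python, scipy assumed) applies the transformation from the theorem to r = 0.84 with n = 10 and compares the result to the standard normal critical value.

import math
from scipy.stats import norm

r, n, rho0 = 0.84, 10, 0.0

z_r = 0.5 * math.log((1 + r) / (1 - r))        # transformed sample correlation
z_0 = 0.5 * math.log((1 + rho0) / (1 - rho0))  # transformed null value (0 here)

z = (z_r - z_0) * math.sqrt(n - 3)             # about 3.23
p_value = 2 * norm.sf(abs(z))

print(f"Z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {abs(z) > 1.96}")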

15.3 - An Approximate Confidence Interval for Rho

15.3 - An Approximate Confidence Interval for Rho

To develop an approximate confidence interval for ρ, we'll use the normal approximation for the
statistic (1/2) ln( (1 + R)/(1 − R) ) that we used on the previous page for testing H0: ρ = ρ0.

Theorem
An approximate confidence interval for ρ is (L, U) where:

L = ( exp{ ln((1 + r)/(1 − r)) − 2 z(α/2)/√(n − 3) } − 1 ) / ( exp{ ln((1 + r)/(1 − r)) − 2 z(α/2)/√(n − 3) } + 1 )

and

U = ( exp{ ln((1 + r)/(1 − r)) + 2 z(α/2)/√(n − 3) } − 1 ) / ( exp{ ln((1 + r)/(1 − r)) + 2 z(α/2)/√(n − 3) } + 1 )

Proof
We previously learned that:


follows at least approximately a standard normal distribution. So, we can do our usual trick
of starting with a probability statement:

and manipulating the quantity inside the parentheses:

to get ..... can you fill in the details?! ..... the formula for a confidence interval for :

where:

and

as was to be proved!

Example 15-2 (Continued)

An admissions counselor at a large public university was interested in learning whether freshmen
calculus grades are independent of high school math achievement test scores. The sample
correlation coefficient between the mathematics achievement test scores and calculus grades for a
random sample of college freshmen was deemed to be 0.84.

Estimate the population correlation coefficient with 95% confidence.

Answer

Because we are interested in a 95% confidence interval, we use z(0.025) = 1.96. Therefore, the lower
limit of an approximate 95% confidence interval for ρ is:

L = ( exp{ ln(1.84/0.16) − 2(1.96)/√7 } − 1 ) / ( exp{ ln(1.84/0.16) − 2(1.96)/√7 } + 1 ) ≈ 0.447

and the upper limit of an approximate 95% confidence interval for ρ is:

U = ( exp{ ln(1.84/0.16) + 2(1.96)/√7 } − 1 ) / ( exp{ ln(1.84/0.16) + 2(1.96)/√7 } + 1 ) ≈ 0.961
We can be (approximately) 95% confident that the correlation between the population of high
school mathematics achievement test scores and freshmen calculus grades is between 0.447 and
0.961. (Not a particularly useful interval, I might say! It might behoove the admissions counselor to
collect data on a larger sample, so that he or she can obtain a narrower confidence interval.)
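The same interval can be computed in a couple of lines. The Python sketch below uses the hyperbolic tangent for the back-transformation, which is algebraically identical to the exponential form in the theorem above.

import math
from scipy.stats import norm

r, n, conf = 0.84, 10, 0.95
z_crit = norm.ppf(1 - (1 - conf) / 2)      # 1.96 for 95% confidence

z_r = 0.5 * math.log((1 + r) / (1 - r))    # transformed sample correlation
half_width = z_crit / math.sqrt(n - 3)

lower = math.tanh(z_r - half_width)        # about 0.447
upper = math.tanh(z_r + half_width)        # about 0.961

print(f"approximate {conf:.0%} CI for rho: ({lower:.3f}, {upper:.3f})")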


Source: https://online.stat.psu.edu/stat415/lesson/15

Links:

1. https://online.stat.psu.edu/stat415/sites/stat415/files//lesson43/Lesson43_Minitab01.gif
2. https://online.stat.psu.edu/stat415/sites/stat415/files//lesson43/Lesson43_Minitab02.gif


Lesson 8: Chi-Square Test for Independence


Lesson 8: Chi-Square Test for Independence

Overview
Let's start by recapping what we have discussed thus far in the course and mention what remains:

1. The fundamentals of the sampling distributions for the sample mean and the sample
proportion.
2. We illustrated how these sampling distributions form the basis for estimation (confidence
intervals) and testing for one mean or one proportion.
3. Then we extended the discussion to analyzing situations for two variables; one a response and
the other an explanatory. When both variables were categorical we compared two proportions;
when the explanatory was categorical, and the response was quantitative, we compared two
means.
4. Next, we will take a look at other methods and discuss how they apply to situations where:
both variables are categorical with at least one variable with more than two levels (Chi-
Square Test of Independence)
both variables are quantitative (Linear Regression)
the explanatory variable is categorical with more than two levels, and the response is
quantitative (Analysis of Variance or ANOVA)

In this Lesson, we will examine relationships where both variables are categorical using the Chi-
Square Test of Independence. We will illustrate the connection between the Chi-Square test for
independence and the z-test for two independent proportions in the case where each variable has
only two levels.

Going forward, keep in mind that this Chi-Square test, when significant, only provides statistical
evidence of an association or relationship between the two categorical variables. Do NOT confuse
this result with a correlation which refers to a linear relationship between two quantitative variables
(more on this in the next lesson).

The primary method for displaying the summarization of categorical variables is called a
contingency table. When we have two measurements on our subjects that are both categorical, the
contingency table is sometimes referred to as a two-way table.

This terminology is derived because the summarized table consists of rows and columns (i.e., the
data display goes two ways).

The size of a contingency table is defined by the number of rows times the number of columns
associated with the levels of the two categorical variables. The size is notated , where is the
number of rows of the table and is the number of columns. A cell displays the count for the
intersection of a row and column. Thus the size of a contingency table also gives the number of cells
for that table. For example, if we have a table, then we have cells.

Note! As we will see, these contingency tables usually include a 'total' row and a 'total' column
which represent the marginal totals, i.e., the total count in each row and the total count in each
column. This total row and total column are NOT included in the size of the table. The size refers to


the number of levels of the actual categorical variables in the study.

Application

Political Affiliation and Opinion

A random sample of 500 U.S. adults is questioned regarding their political affiliation and opinion on
a tax reform bill. The results of this survey are summarized in the following contingency table:

Favor Indifferent Opposed Total

Democrat 138 83 64 285

Republican 64 67 84 215

Total 202 150 148 500

The size of this table is $2\times 3$ and NOT $3\times 4$. There are only two rows of observed data
for Party Affiliation and three columns of observed data for their Opinion. We define the Party
Affiliation as the explanatory variable and Opinion as the response because it is more natural to
analyze how one's opinion is shaped by their party affiliation than the other way around.

From here, we would want to determine if an association (relationship) exists between Political Party
Affiliation and Opinion on Tax Reform Bill. That is, are the two variables dependent. We'll discuss in
the next section how to approach this.

Objectives
Upon successful completion of this lesson, you should be able to:

Determine when to use the Chi-Square test for independence.


Compute expected counts for a table assuming independence.
Calculate the Chi-Square test statistic given a contingency table by hand and with technology.
Conduct the Chi-Square test for independence.
Explain how the Chi-Square test for independence is related to the hypothesis test for two
independent proportions.
Calculate and interpret risk and relative risk.

8.1 - The Chi-Square Test of Independence

8.1 - The Chi-Square Test of Independence


How do we test the independence of two categorical variables? It will be done using the Chi-Square
Test of Independence.

As with all prior statistical tests we need to define null and alternative hypotheses. Also, as we have
learned, the null hypothesis is what is assumed to be true until we have evidence to go against it. In
this lesson, we are interested in researching if two categorical variables are related or associated (i.e.,
dependent). Therefore, until we have evidence to suggest that they are, we must assume that they
are not. This is the motivation behind the hypothesis for the Chi-Square Test of Independence:

: In the population, the two categorical variables are independent.


: In the population, the two categorical variables are dependent.

Note! There are several ways to phrase these hypotheses. Instead of using the words "independent"
and "dependent" one could say "there is no relationship between the two categorical variables"
versus "there is a relationship between the two categorical variables." Or "there is no association
between the two categorical variables" versus "there is an association between the two variables."
The important part is that the null hypothesis refers to the two categorical variables not being
related while the alternative is trying to show that they are related.

Once we have gathered our data, we summarize the data in the two-way contingency table. This
table represents the observed counts and is called the Observed Counts Table or simply the
Observed Table. The contingency table on the introduction page to this lesson represented the
observed counts of the party affiliation and opinion for those surveyed.

The question becomes, "How would this table look if the two variables were not related?" That is,
under the null hypothesis that the two variables are independent, what would we expect our data to
look like?

Consider the following table:

Success Failure Total


Group 1 A B A+B

Group 2 C D C+D
Total A+C B+D A+B+C+D

The total count is A + B + C + D. Let's focus on one cell, say Group 1 and Success with observed
count A. If we go back to our probability lesson, let G denote the event 'Group 1' and S denote the
event 'Success.' Then,

P(G) = (A + B)/(A + B + C + D) and P(S) = (A + C)/(A + B + C + D).

Recall that if two events are independent, then their intersection is the product of their respective
probabilities. In other words, if G and S are independent, then...

P(G ∩ S) = P(G) P(S)

If we considered counts instead of probabilities, then we get the count by multiplying the probability
by the total count. In other words...

Expected count for the Group 1/Success cell = P(G) P(S) (A + B + C + D) = (A + B)(A + C)/(A + B + C + D)

This is the count we would expect to see if the two variables were independent (i.e. assuming the
null hypothesis is true).

Expected Cell Count

The expected count for each cell under the null hypothesis is:

E = (row total × column total) / sample size
Example 8-1: Political Affiliation and Opinion


To demonstrate, we will use the Party Affiliation and Opinion on Tax Reform example.

Observed Table:

favor indifferent opposed total


democrat 138 83 64 285

republican 64 67 84 215
total 202 150 148 500

Find the expected counts for all of the cells.

Answer

We need to find what is called the Expected Counts Table or simply the Expected Table. This table
displays what the counts would be for our sample data if there were no association between the
variables.

Calculating Expected Counts from Observed Counts


              favor                 indifferent           opposed               total
democrat      (285×202)/500=115.14  (285×150)/500=85.50   (285×148)/500=84.36   285
republican    (215×202)/500=86.86   (215×150)/500=64.50   (215×148)/500=63.64   215
total         202                   150                   148                   500
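If you prefer to let software produce the expected counts, a minimal Python sketch using numpy is shown below; it simply applies the row total × column total / sample size rule to the observed table.

import numpy as np

# Observed counts: rows = democrat, republican; columns = favor, indifferent, opposed
observed = np.array([[138, 83, 64],
                     [64, 67, 84]])

row_totals = observed.sum(axis=1)   # [285, 215]
col_totals = observed.sum(axis=0)   # [202, 150, 148]
n = observed.sum()                  # 500

expected = np.outer(row_totals, col_totals) / n
print(expected)
# [[115.14  85.5   84.36]
#  [ 86.86  64.5   63.64]]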

Chi-Square Test Statistic


To better understand what these expected counts represent, first recall that the expected counts
table is designed to reflect what the sample data counts would be if the two variables were
independent. Taking what we know of independent events, we would be saying that the sample
counts should show similarity in opinions of tax reform between democrats and republicans. If you
find the proportion of each cell by taking a cell's expected count divided by its row total, you will
discover that in the expected table each opinion proportion is the same for democrats and
republicans. That is, from the expected counts, 0.404 of the democrats and 0.404 of the republicans
favor the bill; 0.3 of the democrats and 0.3 of the republicans are indifferent; and 0.296 of the
democrats and 0.296 of the republicans are opposed.

The statistical question becomes, "Are the observed counts so different from the expected counts
that we can conclude a relationship exists between the two variables?" To conduct this test we
compute a Chi-Square test statistic where we compare each cell's observed count to its respective
expected count.

In an r × c summary table, we have r × c cells. Let O1, ..., Orc denote the observed counts for
each cell and E1, ..., Erc denote the respective expected counts for each cell.

Chi-Square Test Statistic

The Chi-Square test statistic is calculated as follows:

χ²* = Σ (Oi − Ei)² / Ei

Under the null hypothesis and certain conditions (discussed below), the test statistic follows a Chi-
Square distribution with degrees of freedom equal to (r − 1)(c − 1), where r is the number of rows
and c is the number of columns. We leave out the mathematical details to show why this test statistic
is used and why it follows a Chi-Square distribution.

As we have done with other statistical tests, we make our decision by either comparing the value of
the test statistic to a critical value (rejection region approach) or by finding the probability of getting
this test statistic value or one more extreme (p-value approach).

The critical value for our Chi-Square test is χ²(α) with degrees of freedom = (r − 1)(c − 1), while the p-
value is found by P(χ² ≥ χ²*) with degrees of freedom = (r − 1)(c − 1).


Example 8-1 Cont'd: Chi-Square


Let's apply the Chi-Square Test of Independence to our example where we have a random sample of
500 U.S. adults who are questioned regarding their political affiliation and opinion on a tax reform
bill. We will test if the political affiliation and their opinion on a tax reform bill are dependent at a 5%
level of significance. Calculate the test statistic.

Answer

By Hand [1]
Using Minitab [2]

1. The contingency table (political_affiliation.csv) is given below. Each cell contains the observed
count and the expected count in parentheses. For example, there were 138 democrats who
favored the tax bill. The expected count under the null hypothesis is 115.14. Therefore, the cell is
displayed as 138 (115.14). [3]

favor indifferent opposed total

democrat 138 (115.14) 83 (85.5) 64 (84.36) 285


republican 64 (86.86) 67 (64.50) 84 (63.64) 215

total 202 150 148 500

Calculating the test statistic by hand:

χ²* = (138 − 115.14)²/115.14 + (83 − 85.5)²/85.5 + (64 − 84.36)²/84.36 + (64 − 86.86)²/86.86 + (67 − 64.5)²/64.5 + (84 − 63.64)²/63.64 ≈ 22.152

...with degrees of freedom equal to (2 − 1)(3 − 1) = 2.

Note! We do not expect you to calculate the critical value or the p-value by hand. The p-value
can be found using software.
2. Let's apply the Chi-Square Test of Independence to our example where we have a random
sample of 500 U.S. adults who are questioned regarding their political affiliation and opinion on
a tax reform bill. Test if the political affiliation and their opinion on a tax reform bill are
dependent at a 5% level of significance.

Minitab: Chi-Square Test of Independence

To perform the Chi-Square test in Minitab...

1. Choose Stat > Tables > Chi-Square Test for Association


2. If you have summarized data (i.e., observed count) from the drop-down box 'Summarized
data in a two-way table.' Select and enter the columns that contain the observed counts,
otherwise, if you have the raw data use 'Raw data' (categorical variables). Note that if using


the raw data your data will need to consist of two columns: one with the explanatory
variable data (goes in the 'row' field) and the response variable data (goes in the 'column'
field).
3. Labeling (Optional) When using the summarized data you can label the rows and columns
if you have the variable labels in columns of the worksheet. For example, if we have a
column with the two political party affiliations and a column with the three opinion choices
we could use these columns to label the output.
4. Click the Statistics tab. Keep checked the four boxes already checked, but also check
the box for 'Each cell's contribution to the chi-square.' Click OK.
5. Click OK.

Note! If you have the observed counts in a table, you can copy/paste them into Minitab. For
instance, you can copy the entire observed counts table (excluding the totals!) for our example
and paste these into Minitab starting with the first empty cell of a column.

The following is the Minitab output for this example.

Cell Contents: Count, Expected count, Contribution to Chi-Square

        favor      indifferent  opposed    All
1       138        83           64         285
        115.14     85.50        84.36
        4.5386     0.0731       4.9138
2       64         67           84         215
        86.86      64.50        63.64
        6.0163     0.0969       6.5137
All     202        150          148        500

Pearson Chi-Sq = 4.5386 + 0.073 + 4.914 + 6.016 + 0.097 + 6.5137 = 22.152 DF = 2, P-Value =
0.000

Likelihood Ratio Chi-Square

(Ignore the Fisher's p-value! The p-value highlighted above is calculated using the methods
we learned in this lesson. More specifically, the chi-square we learned is referred to as the
Pearson Chi-Square. The Fisher's test uses a different method than what we explained in


this lesson to calculate a test statistic and p-value. This method incorporates a log of the
ratio of observed to expected values. It's a different technique that is more complicated to
do by-hand. Minitab automatically includes both results in its output.)

The Chi-Square test statistic is 22.152 and calculated by summing all the individual cell's Chi-
Square contributions:

The p-value is found by P(χ² ≥ 22.152) with degrees of freedom = (2 − 1)(3 − 1) = 2.

Minitab calculates this p-value to be less than 0.001 and reports it as 0.000. Given this p-value of
0.000 is less than the alpha of 0.05, we reject the null hypothesis that political affiliation and their
opinion on a tax reform bill are independent. We conclude that there is evidence that the two
variables are dependent (i.e., that there is an association between the two variables).
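The same test is easy to reproduce outside of Minitab. For instance, the Python sketch below uses scipy's chi2_contingency function; for a table larger than 2 × 2 no continuity correction is applied, so the results agree with the hand calculation above.

from scipy.stats import chi2_contingency

observed = [[138, 83, 64],
            [64, 67, 84]]

chi2, p_value, dof, expected = chi2_contingency(observed)

print(f"chi-square = {chi2:.3f}, df = {dof}, p-value = {p_value:.6f}")
# chi-square = 22.152, df = 2, p-value well below 0.001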

Conditions for Using the Chi-Square Test


Exercise caution when there are small expected counts. Minitab will give a count of the number of
cells that have expected frequencies less than five. Some statisticians hesitate to use the Chi-Square
test if more than 20% of the cells have expected frequencies below five, especially if the p-value is
small and these cells give a large contribution to the total Chi-Square value.

Example 8-2: Tire Quality


The operations manager of a company that manufactures tires wants to determine whether there are
any differences in the quality of work among the three daily shifts. She randomly selects 496 tires
and carefully inspects them. Each tire is either classified as perfect, satisfactory, or defective, and the
shift that produced it is also recorded. The two categorical variables of interest are the shift and
condition of the tire produced. The data (shift_quality.txt) can be summarized by the accompanying
two-way table. Does the data provide sufficient evidence at the 5% significance level to infer that
there are differences in quality among the three shifts? [4]

Perfect Satisfactory Defective Total

Shift 1 106 124 1 231

Shift 2 67 85 1 153

Shift 3 37 72 3 112

Total 210 281 5 496

Answer
Minitab output:

Chi-Square Test

        C1       C2       C3      Total
1       106      124      1       231
        97.80    130.87   2.33
2       67       85       1       153
        64.78    86.68    1.54
3       37       72       3       112
        47.42    63.45    1.13
Total   210      281      5       496

Chi-Sq = 8.647 DF = 4, P-Value = 0.071

Note that there are 3 cells with expected counts less than 5.0.

In the above example, we don't have a significant result at a 5% significance level since the p-value
(0.071) is greater than 0.05. Even if we did have a significant result, we still could not trust the result,
because there are 3 (33.3% of) cells with expected counts < 5.0

Caution!

Sometimes researchers will categorize quantitative data (e.g., take height measurements and
categorize as 'below average,' 'average,' and 'above average.') Doing so results in a loss of
information - one cannot do the reverse of taking the categories and reproducing the raw
quantitative measurements. Instead of categorizing, the data should be analyzed using quantitative
methods.

Try it!
A food services manager for a baseball park wants to know if there is a relationship between gender
(male or female) and the preferred condiment on a hot dog. The following table summarizes the
results. Test the hypothesis with a significance level of 10%.


Condiment

Ketchup Mustard Relish Total


Male 15 23 10 48
Gender
Female 25 19 8 52
Total 40 42 18 100

Answer

The hypotheses are:

: Gender and condiments are independent


: Gender and condiments are not independent

We need the expected counts table:

Condiment

Ketchup Mustard Relish Total


Male 15 (19.2) 23 (20.16) 10 (8.64) 48
Gender
Female 25 (20.8) 19 (21.84) 8 (9.36) 52
Total 40 42 18 100

None of the expected counts in the table are less than 5. Therefore, we can proceed with the Chi-
Square test.

The test statistic is:

χ²* = (15 − 19.2)²/19.2 + (23 − 20.16)²/20.16 + (10 − 8.64)²/8.64 + (25 − 20.8)²/20.8 + (19 − 21.84)²/21.84 + (8 − 9.36)²/9.36 ≈ 2.95

The p-value is found by P(χ² ≥ 2.95) with (3 − 1)(2 − 1) = 2 degrees of freedom.


Using a table or software, we find the p-value to be 0.2288.

With a p-value greater than 10%, we can conclude that there is not enough evidence in the data to
suggest that gender and preferred condiment are related.

8.2 - The 2x2 Table: Test of 2 Independent Proportions

8.2 - The 2x2 Table: Test of 2 Independent Proportions

Say we have a study of two categorical variables each with only two levels. One of the response
levels is considered the "success" response and the other the "failure" response. A general 2 × 2
table of the observed counts would be as follows:


Success Failure Total

Group 1 A B A+B

Group 2 C D C+D

The observed counts in this table represent the following proportions:

          Success      Failure      Total
Group 1   A/(A+B)      B/(A+B)      A+B
Group 2   C/(C+D)      D/(C+D)      C+D

Recall from our Z-test of two proportions that our null hypothesis is that the two population
proportions, p1 and p2, were assumed equal while the two-sided alternative hypothesis was that
they were not equal.

This null hypothesis would be analogous to the two groups being independent.

Also, if the two success proportions are equal, then the two failure proportions would also be equal.
Note as well that with our Z-test the conditions were that the number of successes and failures for
each group was at least 5. That equates to the Chi-square conditions that all expected cells in a 2 × 2
table be at least 5. (Remember at least 80% of all cells need an expected count of at least 5. With
80% of 4 equal to 3.2 this means all four cells must satisfy the condition).

When we run a Chi-square test of independence on a 2 × 2 table, the resulting Chi-square test
statistic would be equal to the square of the Z-test statistic (i.e., χ²* = Z²) from the Z-test of two
independent proportions.

Application

Political Affiliation and Opinion

Consider the following example where we form a 2 × 2 for the Political Party and Opinion by only
considering the Favor and Opposed responses:

             favor   oppose   Total
democrat     138     64       202
republican   64      84       148
Total        202     148      350

The Chi-square test produces a test statistic of 22.00 with a p-value of 0.00

The Z-test comparing the two sample proportions of p̂(dem) = 138/202 ≈ 0.683 minus
p̂(rep) = 64/148 ≈ 0.432 results in a Z-test statistic of 4.69 with a p-value of less than 0.0001.

If we square the Z-test statistic, we get (4.69)² ≈ 22.0, or 22.00 with rounding error.

Try it!
The condiments and gender data were condensed to consider gender and either mustard or
ketchup. The manager wants to know if the proportion of males that prefer ketchup is the same as
the proportion of females that prefer ketchup. Test the hypothesis two ways (1) using the Chi-square
test and (2) using the z-test for independence with a significance level of 10%. Show how the two
test statistics are related and compare the p-values.

Condiment

Ketchup Mustard Total


Male 15 23 38
Gender
Female 25 19 44

Total 40 42 82
Answer

Z-test for two proportions

The hypotheses are:

H0: p1 = p2 versus HA: p1 ≠ p2

Let males be denoted as sample one and females as sample two. Using the table, we have:

n1 = 38 and p̂1 = 15/38 ≈ 0.3947

and

n2 = 44 and p̂2 = 25/44 ≈ 0.5682

The conditions are satisfied for this test (verify for extra practice).

To calculate the test statistic, we need:

p̂ = (15 + 25) / (38 + 44) = 40/82 ≈ 0.4878

The test statistic is:

Z = (0.3947 − 0.5682) / √( 0.4878(0.5122)(1/38 + 1/44) ) ≈ −1.57

The p-value is 2 × P(Z ≤ −1.57) ≈ 0.1172.

The p-value is greater than our significance level. Therefore, there is not enough evidence in the
data to suggest that the proportion of males that prefer ketchup is different than the proportion of
females that prefer ketchup.

Chi-square Test for independence

The expected count table is:

Condiment
Ketchup Mustard Total
Male 15 (18.537) 23 (19.463) 38
Gender
Female 25 (21.463) 19 (22.537) 44
Total 40 42 82

There are no expected counts less than 5. The test statistic is:

χ²* = (15 − 18.537)²/18.537 + (23 − 19.463)²/19.463 + (25 − 21.463)²/21.463 + (19 − 22.537)²/22.537 ≈ 2.455

With 1 degree of freedom, the p-value is 0.1168.
Therefore, there is not enough evidence to suggest that gender and condiments (ketchup or
mustard) are related.

Comparison

The p-values would be the same without rounding errors (0.1172 vs 0.1168). The z-statistic is
-1.567. The square of this value is 2.455 which is what we have (rounded) for the chi-square
statistic. The conclusions are the same.
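The χ²* = Z² relationship is easy to see numerically. The Python sketch below (scipy assumed) runs both tests on the 2 × 2 condiment table, turning off the continuity correction so that the chi-square statistic is exactly the square of the Z statistic.

import math
from scipy.stats import norm, chi2_contingency

observed = [[15, 23],    # males: ketchup, mustard
            [25, 19]]    # females: ketchup, mustard

# Z-test for two proportions with the pooled standard error
p1, p2 = 15 / 38, 25 / 44
p_pool = 40 / 82
z = (p1 - p2) / math.sqrt(p_pool * (1 - p_pool) * (1 / 38 + 1 / 44))
p_z = 2 * norm.sf(abs(z))

# Chi-square test without the Yates continuity correction
chi2, p_chi, dof, _ = chi2_contingency(observed, correction=False)

print(f"z = {z:.3f}, z squared = {z ** 2:.3f}, chi-square = {chi2:.3f}")
print(f"two-proportion p-value = {p_z:.4f}, chi-square p-value = {p_chi:.4f}")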

8.3 - Risk, Relative Risk and Odds


8.3 - Risk, Relative Risk and Odds

Risk
In this section, we will introduce some other measures we can find using a contingency table. One of
the most straightforward measures to find is the risk of any given event.

Risk
The probability that an event will occur.

In simple terms, a risk for a group is the same as the proportion of "success" for a particular group.

Relative Risk
Have you ever heard a doctor tell you or a family member something similar to the following: "If you
do not lose weight or get your cholesterol under control you are about five times more likely to
suffer a heart attack than if you had these numbers in the normal range." If so, how alarmed should
one be? "Five times" sounds alarming!

First off, this "five times" represents what is called relative risk.

Relative risk
Relative risk is a ratio of the risks of two groups.

In the example described above, it would be the risk of heart attack for a person in their current
condition compared to the risk of heart attack if that person were in the normal ranges. However, to
truly interpret the severity of a relative risk we have to know the baseline risk.

Baseline Risk
The baseline risk is the denominator of relative risk, i.e., the risk of the group being compared
to.

In our example, this would be the risk of heart attack for the normal range. If this baseline risk is
high, then a relative risk of 5 would be alarming; if the baseline risk is small, then a relative risk of 5
may not be too serious.

For instance, if the risk of a heart attack for someone in the normal range was 1 out of 10, then the
risk of a heart attack for a person with the above average numbers would be five times this or 5 out
of 10. That is, the person would have roughly a 50/50 chance of suffering a heart attack if they didn't
get their weight and cholesterol in check. However, if the risk of a heart attack for the normal range
group was 1 out of 500, then the risk of a heart attack for a person with above average numbers
would be 5 out of 500 or 0.01. The person would have about a 1% chance of a heart attack if they
didn't improve their health. In both cases the relative risk was 5, but with entirely different levels of
impact. Please note this example is not meant to be interpreted that taking care of your health is not
important!!!


Another measure we can find is odds.

Odds
Odds is a ratio of the number of “success” over the number of “failures.” It can be reported as a
fraction or as “number of success: number of failures.”

Example 8-1 Cont'd: Risk and Relative Risk


If we return to our Political Party and Opinion survey data, find the risk for either party favoring the
tax bill and use these risks to find and interpret a relative risk. Also, find the odds of a democrat
favoring the bill.

favor indifferent opposed total


democrat 138 83 64 285
republican 64 67 84 215

total 202 150 148 500


Answer

From the table, the risk of democrats favoring the bill:

138/285 ≈ 0.484

The risk of republicans favoring the bill:

64/215 ≈ 0.298

The relative risk that democrats favor the bill compared to republicans:

0.484 / 0.298 ≈ 1.6

We would interpret this relative risk as "Democrats are about 1.6 times more likely than Republicans
to favor the bill (i.e.: Democrats are 60% more likely to support the bill than Republicans)."

The odds of a democrat favoring the tax bill is 138/147 ≈ 0.94, or 138:147.
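These are one-line calculations, but for completeness here is a short Python sketch that reproduces the risks, the relative risk, and the odds from the table above.

favor_dem, total_dem = 138, 285
favor_rep, total_rep = 64, 215

risk_dem = favor_dem / total_dem    # about 0.484
risk_rep = favor_rep / total_rep    # about 0.298

relative_risk = risk_dem / risk_rep                 # about 1.6
odds_dem = favor_dem / (total_dem - favor_dem)      # 138/147, about 0.94

print(f"risk (democrats) = {risk_dem:.3f}, risk (republicans) = {risk_rep:.3f}")
print(f"relative risk = {relative_risk:.2f}, odds of a democrat favoring = {odds_dem:.2f}")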

Try it!
Consider again our previous example comparing gender and preferred condiments. The summary
table is shown below for convenience.

Condiment
Ketchup Mustard Total
Male 15 23 38
Gender
Female 25 19 44
Total 40 42 82

Find the risk of either gender preferring ketchup and use those risks to find and interpret the relative
risk.

Answer

The risk of males preferring ketchup is 15/38 ≈ 0.395.

The risk of females preferring ketchup is 25/44 ≈ 0.568.

The relative risk that females prefer ketchup compared to males is:

We can interpret the relative risk as...

“Females are about 1.435 times more likely to prefer ketchup on hot dogs than males.”

8.4 - Lesson 8 Summary

8.4 - Lesson 8 Summary

In this Lesson, we learned how to calculate counts under the assumption that the two categorical
variables are independent. We then used these expected counts to test the hypotheses:

The two variables are independent.


The two variables are not independent.

We demonstrated how this test relates to our test for two proportions when the alternative is two-
sided.

We also introduced the terms risk and relative risk. The calculation, as well as the interpretation, is
discussed.

In the next Lesson, we will consider the case where there are two quantitative variables (quantitative
response and quantitative explanatory variable). We will explore how to determine if the variables
have a significant linear relationship.


Source: https://online.stat.psu.edu/stat500/lesson/8

Links:

1. https://online.stat.psu.edu/stat500#tablist-cke_4-tab-pane-1
2. https://online.stat.psu.edu/stat500#tablist-cke_4-tab-pane-2
3. https://online.stat.psu.edu/stat500/sites/stat500/files/data/political_affiliation.csv
4. https://online.stat.psu.edu/stat500/sites/stat500/files/data/shift_quality.txt
