
L19 - Chi Square Test 1

We studied techniques for analyzing categorical data including the chi-square goodness-of-fit test. This test analyzes probabilities of multinomial distribution trials along a single dimension by comparing expected frequencies from a theoretical population distribution to observed frequencies from a sample. For example, it can test if customer satisfaction ratings at a supermarket match national survey results. The test calculates a chi-square statistic and compares it to a critical value, rejecting the null hypothesis of a match if the statistic exceeds the critical value. Uneven milk sales distributions found using this test mean production managers must plan for varying demand levels across months.

Uploaded by

quang hoa

The techniques presented here for analyzing categorical data, the chi-square goodness-of-fit test and the chi-square test of independence, are an outgrowth of the binomial distribution and the inferential techniques for analyzing population proportions.

The binomial distribution allows only two possible outcomes on a single trial of an experiment. An extension of the binomial distribution is the multinomial distribution, in which more than two possible outcomes can occur in a single trial.

The chi-square goodness-of-fit test is used to analyze probabilities of
multinomial distribution trials along a single dimension.
For example, if the variable being studied is economic class with three
possible outcomes of lower income class, middle income class, and
upper income class, the single dimension is economic class and the
three possible outcomes are the three classes. On each trial, one and
only one of the outcomes can occur. In other words, a family unit must
be classified either as lower income class, middle income class, or
upper income class and cannot be in more than one class.
The chi-square goodness-of-fit test compares the expected, or theoretical, frequencies of
categories from a population distribution to the observed, or actual, frequencies from a
sample to determine whether there is a difference between what was expected and what was
observed.

For example, airline industry officials might theorize that the ages of airline ticket purchasers are
distributed in a particular way. To validate or reject this expected distribution, an actual sample
of ticket purchaser ages can be gathered randomly, and the observed results can be compared to
the expected results with the chi-square goodness-of-fit test.

This test can also be used to determine whether the observed arrivals at teller windows at a bank
are Poisson distributed, as might be expected.

In the paper industry, manufacturers can use the chi-square goodness-of-fit test to determine
whether the demand for paper follows a uniform distribution throughout the year.

As a rule, if a uniform distribution is being used as the expected distribution or if an
expected distribution of values is given, k - 1 degrees of freedom are used in the test.
In testing to determine whether an observed distribution is Poisson, the degrees of
freedom are k - 2 because an additional degree of freedom is lost in estimating λ.
In testing to determine whether an observed distribution is normal, the degrees of
freedom are k - 3 because two additional degrees of freedom are lost in estimating
both µ and σ from the observed sample data.
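The rule above amounts to subtracting one degree of freedom for the fixed total and one more for each parameter estimated from the sample. A minimal sketch of that bookkeeping follows; the function name `gof_degrees_of_freedom` and the choice of k = 6 categories are assumptions made only for illustration.

```python
def gof_degrees_of_freedom(k: int, estimated_params: int = 0) -> int:
    """Degrees of freedom for a goodness-of-fit test with k categories.

    One degree is lost because the expected total must equal the observed
    total; one more is lost for each parameter estimated from the sample.
    """
    df = k - 1 - estimated_params
    if df < 1:
        raise ValueError("too few categories for this many estimated parameters")
    return df


k = 6  # hypothetical number of categories, chosen only for illustration

print(gof_degrees_of_freedom(k))                      # given or uniform: k - 1
print(gof_degrees_of_freedom(k, estimated_params=1))  # Poisson, lambda estimated: k - 2
print(gof_degrees_of_freedom(k, estimated_params=2))  # normal, mu and sigma estimated: k - 3
```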
How can the chi-square goodness-of-fit test be applied to business situations? One survey of
U.S. consumers conducted by The Wall Street Journal and NBC News asked the question:

“In general, how would you rate the level of service that American businesses provide?”

The distribution of responses to this question was as follows:

• Excellent 8%
• Pretty good 47%
• Only fair 34%
• Poor 11%

Suppose a store manager wants to find out whether the results of this
consumer survey apply to customers of supermarkets in her city. To do
so, she interviews 207 randomly selected consumers as they leave
supermarkets in various parts of the city. She asks the customers how
they would rate the level of service at the supermarket from which they
had just exited.

HYPOTHESIZE:
STEP 1. The hypotheses for this example follow.
H0: The observed distribution is the same as the expected distribution.
Ha: The observed distribution is not the same as the expected distribution.

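STEP 2. The statistical test being used is the chi-square goodness-of-fit statistic

$$\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$$

where f_o is the observed frequency and f_e is the expected frequency in each category.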

STEP 3. Let α = .05.

STEP 4. Chi-square goodness-of-fit tests are one tailed because a chi-square of zero
indicates perfect agreement between distributions. Any deviation from zero difference
occurs in the positive direction only because chi-square is determined by a sum of squared
values and can never be negative.

With four categories in this example (excellent, pretty good, only fair, and poor), k = 4. The degrees of
freedom are k - 1 because the expected distribution is given: k - 1 = 4 - 1 = 3.

For α = .05 and df = 3, the critical chi-square value is $\chi^2_{.05,3} = 7.8147$. After the data are analyzed, an observed
chi-square greater than 7.8147 must be computed in order to reject the null hypothesis.
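The critical value is the point that leaves an upper-tail area of α under the chi-square curve. As a rough check that does not rely on a printed table, the sketch below evaluates that tail area with closed-form sums available for integer degrees of freedom. The function name `chi2_upper_tail` is an assumption for illustration, and only the Python standard library is used.

```python
import math


def chi2_upper_tail(x: float, df: int) -> float:
    """P(chi-square with df degrees of freedom > x), via closed-form sums."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over i < df/2 of (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2)
    #         * sum over r of x^(r - 1/2) / (1 * 3 * ... * (2r - 1))
    term = math.sqrt(x)
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    tail = math.erfc(math.sqrt(x / 2))
    return tail + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total


# Area to the right of the critical value for df = 3; should be close to alpha = .05
print(round(chi2_upper_tail(7.8147, 3), 4))
```

An observed statistic whose tail area is smaller than α lies beyond the critical value, so comparing the tail area with α gives the same decision as comparing the statistic with the table value.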

STEP 5. The observed values gathered in the sample data sum to 207.
Thus n = 207. The expected proportions are given, but the expected
frequencies must be calculated by multiplying the expected proportions
by the sample total of the observed frequencies, as shown in the next slide.
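The multiplication described in Step 5 can be sketched as follows, using the four survey proportions quoted earlier and n = 207. The variable names are assumptions chosen for readability.

```python
# Expected proportions from the national survey quoted earlier.
expected_proportions = {
    "Excellent": 0.08,
    "Pretty good": 0.47,
    "Only fair": 0.34,
    "Poor": 0.11,
}
n = 207  # total number of shoppers interviewed

# Expected frequency = expected proportion * sample total
expected_frequencies = {
    category: proportion * n
    for category, proportion in expected_proportions.items()
}

for category, f_e in expected_frequencies.items():
    print(f"{category:<12} {f_e:6.2f}")
```

The expected frequencies come out to 16.56, 97.29, 70.38, and 22.77, which add up to the sample total of 207. They need not be whole numbers.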

STEP 6. The chi-square goodness-of-fit statistic can then be calculated, as shown in the figure.

STEP 7. Because the observed value of chi-square of 6.25 is not greater than the critical table value of 7.8147, the store
manager will not reject the null hypothesis.

STEP 8. Thus the data gathered in the sample of 207 supermarket shoppers indicate that the distribution of responses of
supermarket shoppers in the manager’s city is not significantly different from the distribution of responses to the national
survey. The store manager may conclude that her customers do not appear to have attitudes different from those people
who took the survey.
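Steps 6 and 7 can be sketched in a few lines. The table of observed counts is not reproduced in the text above, so the counts below are an assumption made purely for illustration: they were chosen to sum to 207 and to give a statistic near the reported 6.25, and should not be read as the survey's actual data. The function name `chi_square_statistic` is also hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum of (f_o - f_e)^2 / f_e over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((f_o - f_e) ** 2 / f_e for f_o, f_e in zip(observed, expected))


n = 207
proportions = [0.08, 0.47, 0.34, 0.11]  # excellent, pretty good, only fair, poor
expected = [p * n for p in proportions]

# ASSUMPTION: illustrative observed counts, not taken from the text above.
observed = [21, 109, 62, 15]
assert sum(observed) == n

statistic = chi_square_statistic(observed, expected)
critical_value = 7.8147  # alpha = .05, df = 3, as stated in Step 4
reject_null = statistic > critical_value

print(f"chi-square = {statistic:.2f}, reject H0: {reject_null}")
```

With these placeholder counts the statistic is about 6.25, below 7.8147, so the null hypothesis is not rejected, in line with the decision in Step 7.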

Dairies would like to know whether the sales of milk are distributed uniformly over a year so
they can plan for milk production and storage. A uniform distribution means that the
frequencies are the same in all categories. In this situation, the producers are attempting to
determine whether the amounts of milk sold are the same for each month of the year. They
ascertain the number of gallons of milk sold by sampling one large supermarket each month
during a year, obtaining the following data. Use α = .01 to test whether the data fit a uniform
distribution.

Solution

STEP 1. The hypotheses follow.

H0: The monthly figures for milk sales are uniformly distributed.
Ha: The monthly figures for milk sales are not uniformly distributed.

STEP 2. The statistical test used is

$$\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$$

STEP 3. Alpha is 0.01.

STEP 4. There are 12 categories and a uniform distribution is the expected distribution, so the
degrees of freedom are k - 1 = 12 - 1 = 11. For α = .01, the critical value is $\chi^2_{.01,11} = 24.725$.
An observed chi-square value of more than 24.725 must be obtained to reject the null
hypothesis.

STEP 5. The data are given in the preceding table.

STEP 6. The first step in calculating the test statistic is to determine the expected frequencies.
The total for the expected frequencies must equal the total for the observed frequencies
(18,447). If the frequencies are uniformly distributed, the same number of gallons of milk is
expected to be sold each month. The expected monthly figure is 18447/12 = 1537.25.
The following table shows the observed frequencies, the expected frequencies, and the chi-square
calculations for this problem.
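For a uniform expected distribution, every category receives the same share of the observed total. A minimal sketch of that calculation, using only the total and the number of months given above (variable names are assumptions):

```python
total_gallons = 18447  # total of the observed monthly frequencies
k = 12                 # one category per month

# Under a uniform distribution every month has the same expected frequency.
expected_per_month = total_gallons / k
expected = [expected_per_month] * k

print(expected_per_month)  # 1537.25
print(sum(expected))       # 18447.0, matching the observed total
```

Each month's contribution to the statistic is then (f_o - 1537.25)^2 / 1537.25, summed over the 12 months.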

STEP 7. The observed chi-square value of 74.37 is greater than the critical table value of 24.725, so the
decision is to reject the null hypothesis. This problem provides enough evidence to indicate that the
distribution of milk sales is not uniform.

BUSINESS IMPLICATIONS:
STEP 8. Because retail milk demand is not uniformly distributed, sales and production managers need to
generate a production plan to cope with uneven demand. In times of heavy demand, more milk will
need to be processed or on reserve; in times of less demand, provision for milk storage or for a reduction
in the purchase of milk from dairy farmers will be necessary.
