Ombc 106 Notes u11

Uploaded by

anjnaprohike26

Unit 11: Chi-square analysis

1.0 Introduction

Some of the important properties of the chi-square distribution are:

• Unlike the normal and t distributions, the chi-square distribution is not symmetric; it is skewed to the right.
• The values of a chi-square variable are greater than or equal to zero.
• The shape of a chi-square distribution depends upon the degrees of freedom. As the degrees of freedom
increase, the distribution approaches a normal distribution.

There are many applications of a chi-square test. Those mentioned below will be discussed in this unit:
• A chi-square test for the goodness of fit
• A chi-square test for the independence of variables
• A chi-square test for the equality of more than two population proportions.
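
The shape property listed above can be checked numerically. The sketch below relies on standard results that are not stated in this unit: a chi-square variable with k degrees of freedom has mean k, variance 2k and skewness √(8/k). The function name chi_square_shape is illustrative only.

```python
import math

def chi_square_shape(df):
    """Return (mean, variance, skewness) of a chi-square distribution with df degrees of freedom."""
    return df, 2 * df, math.sqrt(8 / df)

# The skewness stays positive but shrinks as the degrees of freedom grow.
for df in (1, 5, 30, 200):
    mean, variance, skewness = chi_square_shape(df)
    print(f"df={df:>3}  mean={mean:>3}  variance={variance:>3}  skewness={skewness:.3f}")
```

The skewness never reaches zero, but it moves toward it as df increases, which is what "approaches a normal distribution" means in practice.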

2.0 A CHI-SQUARE TEST FOR THE GOODNESS OF FIT

What is a Chi-Square Goodness-of-Fit Test? - A Chi-Square Goodness-of-Fit test is a statistical test used to
determine whether sample data fit a specified distribution. It compares the observed frequencies with the
frequencies expected under that theoretical distribution.

When to Use:
• Categorical Data: When your data is categorical (e.g., colors, brands, ratings).
• Theoretical Distribution: You have a specific theoretical distribution in mind (e.g., uniform, normal,
binomial, Poisson).

Hypotheses:
• Null Hypothesis (H₀): The observed frequencies fit the expected frequencies.
• Alternative Hypothesis (H₁): The observed frequencies do not fit the expected frequencies.

Test Statistic:
χ² = Σ [(O - E)² / E]
where:
▪ χ²: Chi-square test statistic
▪ O: Observed frequency
▪ E: Expected frequency
▪ Degrees of Freedom (df) = k - 1, where k is the number of categories (cells).
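
As a minimal sketch, the statistic and its degrees of freedom can be computed as follows. The function names and the three-category counts are hypothetical and chosen only to illustrate the formula.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def goodness_of_fit_df(k):
    """Degrees of freedom for k categories."""
    return k - 1

# Hypothetical data: 30 observations expected to split evenly over 3 categories.
observed = [8, 12, 10]
expected = [10, 10, 10]
stat = chi_square_statistic(observed, expected)
df = goodness_of_fit_df(len(observed))
print(stat, df)
```

Here the three terms are 4/10, 4/10 and 0/10, so the statistic is 0.8 with 2 degrees of freedom.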

Decision Rule:
• Critical Value Approach: Compare the calculated χ² value with the critical χ² value from the chi-square
distribution table for the chosen significance level (α) and degrees of freedom. Reject H₀ if the calculated
value exceeds the critical value.
• p-value Approach: Calculate the p-value associated with the calculated χ² value and reject H₀ if it is less
than the significance level (α).
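
The two approaches can be sketched in code. The p-value below comes from a hand-written series for the regularized incomplete gamma function, which is one way to get the upper-tail chi-square probability without extra libraries; in practice statistical software would normally be used. The statistic 7.2 is hypothetical, and 7.815 is the tabulated critical value for α = 0.05 with 3 degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Power series for the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower_tail = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower_tail

def reject_by_critical_value(stat, critical_value):
    """Critical value approach: reject H0 when the statistic exceeds the table value."""
    return stat > critical_value

def reject_by_p_value(stat, df, alpha):
    """p-value approach: reject H0 when the upper-tail probability is below alpha."""
    return chi2_sf(stat, df) < alpha

# Hypothetical statistic with 3 degrees of freedom, tested at alpha = 0.05.
stat, df, alpha = 7.2, 3, 0.05
print(reject_by_critical_value(stat, 7.815))
print(reject_by_p_value(stat, df, alpha))
```

Both approaches always lead to the same decision for a given α, because the critical value is the point whose upper-tail probability equals α.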

Example:
Suppose we have a six-sided die and we want to test whether it is fair. We roll the die 60 times and
observe the following frequencies:
Face Observed Frequency (O)
1 12
2 10
3 8
4 15
5 9
6 6
Expected Frequency (E) for each face: 60/6 = 10

Calculating the Chi-Square Test Statistic:

χ² = [(12-10)²/10] + [(10-10)²/10] + [(8-10)²/10] + [(15-10)²/10] + [(9-10)²/10] + [(6-10)²/10]
   = (4 + 0 + 4 + 25 + 1 + 16)/10 = 5.0

Degrees of Freedom: df = 6 - 1 = 5

Decision:
▪ Using a significance level of α = 0.05, the critical value for a chi-square distribution with 5 degrees of
freedom is approximately 11.07.
▪ Since the calculated χ² value (5.0) is less than the critical value, we fail to reject the null hypothesis.

Conclusion: There is not enough evidence to conclude that the die is unfair. The observed frequencies are
consistent with the expected frequencies of a fair die.
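
The die example can be reproduced in a few lines. Summing the six terms from the table, (4 + 0 + 4 + 25 + 1 + 16)/10, gives a statistic of 5.0. The variable names are illustrative; the critical value 11.07 is the table value quoted in the example.

```python
observed = [12, 10, 8, 15, 9, 6]           # frequencies for faces 1 to 6
n = sum(observed)                           # 60 rolls in total
expected = [n / len(observed)] * len(observed)  # 10 per face if the die is fair

stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
critical_value = 11.07                      # alpha = 0.05, df = 5
reject_h0 = stat > critical_value

print(f"chi-square = {stat:.2f}, df = {df}, reject H0: {reject_h0}")
```
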
Note: The chi-square goodness-of-fit test is sensitive to sample size. A larger sample size can increase the power
of the test to detect small deviations from the expected distribution.
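
A short sketch of the sample-size effect: if every observed count in the die example is multiplied by 10 (a hypothetical sample of 600 rolls with exactly the same proportions), the statistic is also multiplied by 10 and crosses the critical value of 11.07.

```python
def gof_statistic(observed):
    """Chi-square statistic against equal expected counts in every category."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

small = [12, 10, 8, 15, 9, 6]        # 60 rolls, as in the example
large = [10 * o for o in small]      # hypothetical: 600 rolls, same proportions

for label, counts in (("n=60", small), ("n=600", large)):
    stat = gof_statistic(counts)
    print(f"{label}: chi-square = {stat:.1f}, reject at 11.07: {stat > 11.07}")
```

The deviations from fairness are identical in relative terms, yet only the larger sample gives enough evidence to reject the null hypothesis.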

3.0 A CHI-SQUARE TEST FOR THE INDEPENDENCE OF VARIABLES

What is a Chi-Square Test for Independence? - A Chi-Square Test for Independence is a statistical test used to
determine whether there is a significant association between two categorical variables. It helps us understand if
the variables are independent or dependent.

When to Use:
• Categorical Data: Both variables should be categorical.
• Contingency Table: The data is often presented in a contingency table.

Hypotheses:
• Null Hypothesis (H₀): The two variables are independent.
• Alternative Hypothesis (H₁): The two variables are dependent.

Test Statistic:
χ² = Σ [(O - E)² / E]
where:
▪ χ²: Chi-square test statistic
▪ O: Observed frequency
▪ E: Expected frequency

Expected Frequency: E = (Row Total * Column Total) / Grand Total

Degrees of Freedom (df) = (number of rows - 1) * (number of columns - 1)
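
The expected-frequency and degrees-of-freedom formulas can be sketched as below. The function names and the 2 x 3 table are hypothetical; the table is stored as a list of rows.

```python
def expected_frequencies(table):
    """E = row total * column total / grand total, for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def independence_df(table):
    """(number of rows - 1) * (number of columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)

# Hypothetical observed counts: 2 rows, 3 columns, 120 observations.
observed = [[10, 20, 30],
            [20, 20, 20]]
expected = expected_frequencies(observed)
df = independence_df(observed)
print(expected, df)
```

Both rows total 60 and the columns total 30, 40 and 50, so each row of expected counts is 15, 20, 25 and df = (2 - 1) * (3 - 1) = 2.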

Decision Rule:
▪ Critical Value Approach: Compare the calculated χ² value with the critical χ² value from the chi-square
distribution table for the chosen significance level (α) and degrees of freedom. Reject H₀ if the calculated
value exceeds the critical value.
▪ p-value Approach: Calculate the p-value associated with the calculated χ² value and reject H₀ if it is less
than the significance level (α).

Example:
Suppose we want to test whether there is a relationship between gender and preference for a particular
brand of soda. We collect data from 100 people and organize it into a contingency table:

Gender     Brand A   Brand B   Total
Male       25        25        50
Female     25        25        50
Total      50        50        100

Expected Frequencies (each cell = Row Total * Column Total / Grand Total = 50 * 50 / 100 = 25):

Gender     Brand A   Brand B   Total
Male       25        25        50
Female     25        25        50
Total      50        50        100

Calculating the Chi-Square Test Statistic:


χ² = [(25-25)²/25] + [(25-25)²/25] + ... + [(25-25)²/25] = 0

Degrees of Freedom: df = (2-1) * (2-1) = 1

Decision:
▪ Using a significance level of α = 0.05, the critical value for a chi-square distribution with 1 degree of
freedom is approximately 3.84.
▪ Since the calculated χ² value (0) is less than the critical value, we fail to reject the null hypothesis.

Conclusion: There is not enough evidence to conclude that there is a relationship between gender and
soda preference. The two variables appear to be independent.
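
The soda example can be reproduced with a short script. The helper name expected_frequencies is illustrative; the counts and the critical value 3.84 are those given in the example.

```python
def expected_frequencies(table):
    """E = row total * column total / grand total, for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

observed = [[25, 25],   # Male:   Brand A, Brand B
            [25, 25]]   # Female: Brand A, Brand B
expected = expected_frequencies(observed)

stat = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)
reject_h0 = stat > 3.84   # alpha = 0.05, df = 1

print(f"chi-square = {stat:.2f}, df = {df}, reject H0: {reject_h0}")
```

Because every observed count equals its expected count, each term of the sum is zero, which is the smallest value the statistic can take.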

Note: The chi-square test for independence is sensitive to sample size and to the expected cell frequencies.
Small expected frequencies make the chi-square approximation unreliable; a common rule of thumb is that every
expected cell frequency should be at least 5.

4.0 A CHI-SQUARE TEST FOR THE EQUALITY OF MORE THAN TWO POPULATION PROPORTIONS

This statistical test is used to determine whether the proportions of a categorical variable are equal across
multiple populations. It's a versatile tool that finds applications in various fields, from social sciences to medical
research.

Null and Alternative Hypotheses:


• Null Hypothesis (H₀): The proportions of the categorical variable are the same across all populations.
• Alternative Hypothesis (H₁): At least one population proportion is different from the others.

Assumptions:
1. Independence: Observations within and between groups are independent.
2. Sample Size: Expected cell counts should be at least 5 for the chi-square approximation to be valid.
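
The sample-size assumption can be checked before running the test. This is a minimal sketch with a hypothetical 2 x 2 table; the function names are illustrative and the threshold of 5 is the one stated above.

```python
def expected_frequencies(table):
    """E = row total * column total / grand total, for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def expected_counts_ok(table, minimum=5):
    """True when every expected cell count reaches the minimum."""
    return all(e >= minimum for row in expected_frequencies(table) for e in row)

# Hypothetical table in which the first row is too sparse.
sparse = [[3, 7],
          [12, 28]]
print(expected_counts_ok(sparse))
```

For this table the smallest expected count is 10 * 15 / 50 = 3, so the check fails and the chi-square approximation should not be trusted.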

Steps Involved:
1. Create a Contingency Table:
○ Organize the data into a contingency table with rows representing the categories of the categorical
variable and columns representing the different populations.
2. Calculate Expected Frequencies:
○ For each cell in the table, calculate the expected frequency under the null hypothesis. This is done by
multiplying the row total by the column total and dividing by the grand total.
3. Calculate the Chi-Square Test Statistic:
○ Use the following formula:
χ² = Σ [(O - E)² / E]

where:
▪ χ²: Chi-square test statistic
▪ O: Observed frequency in a cell
▪ E: Expected frequency in a cell
▪ Σ: Summation over all cells in the table
4. Determine Degrees of Freedom:
○ Calculate the degrees of freedom: df = (r - 1) * (c - 1)
where:
▪ r: Number of rows in the contingency table
▪ c: Number of columns in the contingency table
5. Find the Critical Value or p-value:
○ Critical Value Approach:
▪ Use a chi-square distribution table to find the critical value corresponding to the desired
significance level (α) and degrees of freedom.
○ p-value Approach:
▪ Use statistical software or a chi-square calculator to find the p-value associated with the
calculated chi-square test statistic and degrees of freedom.
6. Make a Decision:
○ Critical Value Approach:
▪ If the calculated chi-square test statistic is greater than the critical value, reject the null
hypothesis.
○ p-value Approach:
▪ If the p-value is less than the significance level (α), reject the null hypothesis.
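
The six steps above can be sketched end to end. All counts are hypothetical: three populations (A, B, C) of 100 respondents each, classified as "Yes" or "No". With 2 rows and 3 columns, df = 2, for which the tabulated critical value at α = 0.05 is 5.991 and the p-value has the closed form exp(-χ²/2). That shortcut holds only for df = 2.

```python
import math

# Step 1: contingency table (rows = categories, columns = populations A, B, C).
observed = [[30, 40, 50],   # Yes
            [70, 60, 50]]   # No

# Step 2: expected frequencies under H0.
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Step 3: chi-square test statistic, summed over all cells.
stat = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Step 4: degrees of freedom.
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Step 5: critical value (table) and p-value (closed form valid for df = 2 only).
alpha = 0.05
critical_value = 5.991
p_value = math.exp(-stat / 2)

# Step 6: decision; both approaches agree.
print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value:.4f}")
print("reject H0:", stat > critical_value, p_value < alpha)
```

With these counts the expected frequencies are 40 per population for "Yes" and 60 for "No", the statistic is 25/3 (about 8.33), and the null hypothesis of equal proportions is rejected.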
Interpretation:
• Rejecting the null hypothesis indicates that there is evidence to suggest that at least one population
proportion is different from the others.
• Failing to reject the null hypothesis suggests that there is not enough evidence to conclude that the
population proportions are different.

Additional Considerations:
• Post-hoc Tests: If the null hypothesis is rejected, post-hoc tests can be used to determine which specific
pairs of populations have significantly different proportions.
• Assumptions: It's important to check the assumptions of independence and expected cell sizes before
conducting the chi-square test.
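
One possible post-hoc procedure is sketched below: a 2 x 2 chi-square test for each pair of populations, with the significance level divided by the number of pairs (a Bonferroni adjustment). This unit does not prescribe a specific post-hoc method, so the choice of Bonferroni, the absence of a continuity correction, the function name and the counts are all assumptions made for illustration. For df = 1 the p-value equals erfc(√(χ²/2)).

```python
import math
from itertools import combinations

def chi_square_two_columns(col_a, col_b):
    """Chi-square statistic for the table formed by two population columns."""
    rows = list(zip(col_a, col_b))
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col_a), sum(col_b)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical (Yes, No) counts for three populations.
populations = {"A": (30, 70), "B": (40, 60), "C": (50, 50)}

pairs = list(combinations(populations, 2))
adjusted_alpha = 0.05 / len(pairs)   # Bonferroni: split alpha across the comparisons

significant = []
for first, second in pairs:
    stat = chi_square_two_columns(populations[first], populations[second])
    p_value = math.erfc(math.sqrt(stat / 2))   # upper tail of chi-square with df = 1
    if p_value < adjusted_alpha:
        significant.append((first, second))
    print(f"{first} vs {second}: chi-square = {stat:.3f}, p = {p_value:.4f}")

print("significantly different pairs:", significant)
```

With these counts only the A versus C comparison falls below the adjusted significance level.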

By following these steps and considering the assumptions, you can effectively use the chi-square test to compare
proportions across multiple populations.
