
Lecture 41 - Hypothesis Testing - Chi-Square

Uploaded by

ayeshaabid043

Inferential Statistics & Applied Probability

Course Code: DS221

Credits: 3

Instructor: Muhammad Sajid Ali

Motivation for Chi-square
• In hypothesis testing with t-tests and z-tests, we have been comparing sample means to population means.
• Decision making was based on errors (Type I and Type II) and significance levels.
• What if you have categorical data and want to test a hypothesis about it?
– Is gender related to voting preference?
– Is smoking related to lung cancer?
– Are semester grades related to gender?
• To answer such questions, we use the Chi-square distribution.


Chi-square Distribution
• If $Z_1, \dots, Z_m$ are independent standard normal random variables, then

$X = \sum_{i=1}^{m} Z_i^2$

has a Chi-square distribution with $m$ degrees of freedom, written $X \sim \chi^2(m)$.
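As a quick numerical check of this definition, the sketch below sums m squared standard normal draws and compares the sample mean and variance with the theoretical values m and 2m of a Chi-square distribution. It uses only the Python standard library; the function name `chi_square_sample`, the seed, the choice m = 3, and the number of draws are illustrative assumptions, not part of the lecture.

```python
import random
import statistics

def chi_square_sample(m, rng):
    """Return one draw of X = Z_1^2 + ... + Z_m^2 with Z_i ~ N(0, 1)."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(m))

rng = random.Random(0)          # fixed seed so the run is repeatable
m = 3                           # degrees of freedom (illustrative choice)
draws = [chi_square_sample(m, rng) for _ in range(20000)]

sample_mean = statistics.mean(draws)
sample_var = statistics.variance(draws)
print(f"mean ~ {sample_mean:.2f} (theory: {m})")
print(f"variance ~ {sample_var:.2f} (theory: {2 * m})")
```

The simulated values fluctuate from seed to seed, but they should stay close to m and 2m.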

Chi-square Test
Chi-square is mainly used in two scenarios:

• to see whether an observed frequency distribution fits a specific expected distribution (also known as a goodness-of-fit test);

• to test whether two categorical variables are independent or whether there is some association between them.

We can test the goodness of fit of a model by comparing a statistic $C$ against the Chi-square distribution, where

$C = \sum_{i} \frac{(O_i - E_i)^2}{E_i}$

with $O_i$ the observed count and $E_i$ the expected count in category $i$.

Chi-square – Degree of Freedom
The number of degrees of freedom of the Chi-square distribution for an r × c table is

(r – 1) × (c – 1), where r > 1 and c > 1

Chi-square Test – Example
Suppose that after losing a large amount of money, an
unlucky gambler questions whether the game was fair and
the die was really unbiased.

The last 90 tosses of this die gave the following results

Chi-square Test – Example
Step1 – Define the hypothesis:

Null Hypothesis (H₀): The die is fair,
– meaning each number (1 through 6) has an equal probability of appearing.
– H₀: P(X = x) = 1/6 for x = 1, 2, 3, 4, 5, 6

Alternative Hypothesis (HA): The die is not fair,
– meaning the numbers do not appear with equal probability.
– HA: P(X = x) ≠ 1/6 for at least one x

Step2 – Do the experiment:


The last 90 tosses of this die gave the following results:


The observed counts are 20, 15, 12, 17, 9, and 17.

The corresponding expected counts are
– $E_i = (90)(1/6) = 15$ for each face

Step3 – Calculate the Chi-square test:

$\chi^2 = \sum_{i=1}^{6} \frac{(O_i - E_i)^2}{E_i} = \frac{5^2 + 0^2 + (-3)^2 + 2^2 + (-6)^2 + 2^2}{15} = \frac{78}{15} = 5.2$

Step4 – find the critical value:

df = r – 1 = 6 – 1 = 5
significance level α = 0.05
look in the chi-square table for df = 5 and α = 0.05: the critical value is 11.1

$\chi^2 = 5.2 < 11.1$

Since the statistic does not exceed the critical value, there is no significant evidence to reject H₀,
and therefore no evidence that the die was biased.
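The steps of this example can be reproduced with a short script. This is a minimal sketch in plain Python: the function name `chi_square_gof` is hypothetical, and the critical value 11.07 is the table entry for df = 5 and α = 0.05 (rounded to 11.1 on the slide), hard-coded here instead of being computed.

```python
def chi_square_gof(observed, expected):
    """Pearson statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [20, 15, 12, 17, 9, 17]       # counts from the 90 tosses
n = sum(observed)                        # 90
expected = [n / 6] * 6                   # fair die: 15 per face

statistic = chi_square_gof(observed, expected)
df = len(observed) - 1                   # 6 - 1 = 5
critical_value = 11.07                   # table value for df = 5, alpha = 0.05

reject_h0 = statistic > critical_value
print(f"chi-square = {statistic:.2f}, df = {df}, reject H0: {reject_h0}")
# prints: chi-square = 5.20, df = 5, reject H0: False
```

If SciPy is installed, `scipy.stats.chisquare(observed)` should return the same statistic together with a p-value, since its default expected frequencies are uniform.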
Chi-square Table

Independence Analysis using Chi-square
Given the two-way table, test whether the row and column variables are independent.

Independence Analysis using Chi-square
The theoretical expected counts if the row and column variables are independent.
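Under independence, each expected count is the product of its row total and column total divided by the grand total. Writing $O_{ij}$ for the observed count in row $i$ and column $j$ (this notation, along with $R_i$, $C_j$ and $N$, is assumed here and does not come from the slides), the rule is:

```latex
E_{ij} = \frac{R_i \, C_j}{N},
\qquad R_i = \sum_{j} O_{ij},
\quad C_j = \sum_{i} O_{ij},
\quad N = \sum_{i} \sum_{j} O_{ij}
```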

Chi-square Test – Example
Step1 – Define the hypothesis:

Null Hypothesis (H₀): the two variables (Gender and Category) are independent.

Alternative Hypothesis (HA): the two variables (Gender and Category) are dependent.

Step2 – Do the experiment:


See previous two slides

Step3 – Calculate the Chi-square test:


For boys

For girls

Total chi-square: $\chi^2 = 22.5031$
Chi-square Test – Example
Step4 – find the critical value:
df = (r – 1) × (c – 1) = (3 – 1) × (2 – 1) = 2
significance level α = 0.05
look in the chi-square table for df = 2 and α = 0.05: the critical value is 5.99

$\chi^2 = 22.5031 > 5.99$

Since the calculated Chi-square statistic 22.5031 is greater than the critical value 5.99, we reject the null hypothesis.

There is sufficient evidence to conclude that Gender and Category (Grades, Popular, Sports) are dependent,
meaning the distribution of categories is related to gender.
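The whole independence test can be sketched in a few lines of plain Python. The counts below are hypothetical placeholders chosen only to make the script runnable; they are not the lecture's table and are not meant to reproduce the value 22.5031. The helper names `expected_counts` and `chi_square_independence` are likewise illustrative.

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts."""
    expected = expected_counts(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# HYPOTHETICAL counts: rows = gender, columns = (Grades, Popular, Sports)
table = [
    [30, 20, 50],
    [50, 30, 20],
]
statistic, df = chi_square_independence(table)
critical_value = 5.99   # table value for df = 2, alpha = 0.05
print(f"chi-square = {statistic:.4f}, df = {df}")
print("reject H0" if statistic > critical_value else "fail to reject H0")
```

The decision rule is the same as on the slide: reject H₀ only when the statistic exceeds the tabulated critical value for the table's degrees of freedom.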
Variance Estimator and Chi-square Distribution

Variance Estimation

Chi-square – Sample Variance Distribution
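When observations $X_1, \dots, X_n$ are independent and Normal with $\mathrm{Var}(X_i) = \sigma^2$, the distribution of

$\frac{(n-1)s^2}{\sigma^2} = \sum_{i=1}^{n} \left( \frac{X_i - \bar{X}}{\sigma} \right)^2$

is Chi-square with (n − 1) degrees of freedom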
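A small simulation makes the (n − 1) degrees of freedom visible: averaged over many Normal samples, the scaled sample variance (n − 1)s²/σ² should be close to n − 1 rather than n. This is a sketch using only the Python standard library; the function name `scaled_sample_variance` and the values of n, μ, σ, the seed, and the number of replications are arbitrary illustrative choices.

```python
import random
import statistics

def scaled_sample_variance(n, mu, sigma, rng):
    """Draw n Normal(mu, sigma) observations and return (n - 1) s^2 / sigma^2."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    s2 = statistics.variance(sample)   # sample variance with divisor n - 1
    return (n - 1) * s2 / sigma ** 2

rng = random.Random(1)                 # fixed seed so the run is repeatable
n, mu, sigma = 6, 10.0, 2.0
values = [scaled_sample_variance(n, mu, sigma, rng) for _ in range(5000)]

average = statistics.mean(values)
print(f"average of (n-1)s^2/sigma^2 ~ {average:.2f} (theory: {n - 1})")
```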

Chi-square – Confidence Interval for Population Variance

Critical Values of Chi-square Distribution

Conf Intervals: Variance and std deviation
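• Since we want a confidence interval, we start from the statement that, with $\chi^2_{1-\alpha/2}$ and $\chi^2_{\alpha/2}$ the lower and upper critical values for n − 1 degrees of freedom:

$P\left( \chi^2_{1-\alpha/2} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2} \right) = 1 - \alpha$

• Take the reciprocal (this reverses the inequalities):

$P\left( \frac{1}{\chi^2_{\alpha/2}} \le \frac{\sigma^2}{(n-1)s^2} \le \frac{1}{\chi^2_{1-\alpha/2}} \right) = 1 - \alpha$

• Multiply by $(n-1)s^2$:

$P\left( \frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}} \right) = 1 - \alpha$

• Find the confidence interval for the variance, and take square roots for the standard deviation:

$\left[ \frac{(n-1)s^2}{\chi^2_{\alpha/2}},\ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}} \right]$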

Example
Suppose we have a measurement device giving the data below (n = 6 measurements). Construct a 90% confidence interval for the standard deviation.

Example – Solution

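Actually, we only need $(n-1)s^2 = 31.16$.

From the Chi-square distribution with ν = n − 1 = 5 degrees of freedom, we find the critical values $\chi^2_{1-\alpha/2} = \chi^2_{0.95} = 1.15$ and $\chi^2_{\alpha/2} = \chi^2_{0.05} = 11.1$. Then

$\left[ \sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}},\ \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}} \right] = \left[ \sqrt{\frac{31.16}{11.1}},\ \sqrt{\frac{31.16}{1.15}} \right] = [1.68, 5.21]$

is a 90% confidence interval for the population standard deviation (and, by the way, $[1.68^2, 5.21^2] = [2.82, 27.14]$ is a 90% confidence interval for the variance).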
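The interval can be checked numerically. The sketch below uses only the quantities quoted in the solution, (n − 1)s² = 31.16 and the table values 1.15 and 11.1; the function name `variance_confidence_interval` is hypothetical.

```python
import math

def variance_confidence_interval(sum_sq_dev, chi2_lower, chi2_upper):
    """CI for sigma^2 given (n - 1) s^2 and the two Chi-square critical values.

    chi2_lower is the small critical value (chi^2_{1 - alpha/2}),
    chi2_upper is the large one (chi^2_{alpha/2}).
    """
    return sum_sq_dev / chi2_upper, sum_sq_dev / chi2_lower

var_low, var_high = variance_confidence_interval(31.16, 1.15, 11.1)
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)

print(f"90% CI for sigma: [{sd_low:.2f}, {sd_high:.2f}]")
# prints: 90% CI for sigma: [1.68, 5.21]
```

Dividing directly gives variance bounds of about 2.81 and 27.10; the values 2.82 and 27.14 quoted in the solution come from squaring the already rounded endpoints 1.68 and 5.21.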