10 Inferential Statistics
Statistical Inference
• A statistical hypothesis is a conjecture (an opinion or
conclusion formed on the basis of incomplete
information) concerning one or more populations.
• The truth or falsity of a statistical hypothesis is never known
with absolute certainty unless we examine the entire
population
• However, this approach does not account for values of test statistics
that are “close” to the critical region.
• For each significance level, the Z-test has a single critical value
(for example, 1.96 for a 5% two-tailed test), which makes it more
convenient.
Critical region for alternative
hypothesis
The z Test: An Example
Given: μ = 156.5, σ = 14.6, x̅ = 156.11, N = 97
1. Populations, distributions, and test
– Populations: All students at UMD who have taken
the test (not just our sample)
– Distribution: Sample distribution of means
– Test : z test
2. State the null (H0) and alternative (H1) hypotheses
In symbols: H0: μ1 = μ2
H1: μ1 ≠ μ2
3. Calculate the standard error of the mean
σM = σ/√N = 14.6/√97 = 1.482
4. Determine critical value (cutoffs)
– In Behavioral Sciences, we use p = 0.05
– p = 0.05 = 5% → 2.5% in each tail
– 50% − 2.5% = 47.5%
– Consult z table for 47.5% → z = ±1.96
5. Calculate test statistic
z = (x̅ − μ)/σM = (156.11 − 156.5)/1.482 = −0.26
6. Make a Decision
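The steps of this example reduce to a few lines of arithmetic. The following is a minimal Python sketch that uses only the values given on the slide; the function name `one_sample_z` is an illustrative choice, and the 1.96 cutoff is copied from the z table rather than computed.

```python
import math


def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """Return (standard_error, z) for a one-sample z test."""
    standard_error = pop_sd / math.sqrt(n)          # sigma_M = sigma / sqrt(N)
    z = (sample_mean - pop_mean) / standard_error   # z = (x_bar - mu) / sigma_M
    return standard_error, z


se, z = one_sample_z(sample_mean=156.11, pop_mean=156.5, pop_sd=14.6, n=97)
z_crit = 1.96  # two-tailed cutoff at p = 0.05, taken from the z table

reject_null = abs(z) > z_crit
print(f"standard error = {se:.3f}, z = {z:.2f}, reject H0: {reject_null}")
```

With these inputs |z| is well below 1.96, so the null hypothesis is not rejected.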
Does a Foos run faster?
• When I was growing up my father told me that our last name, Foos, was
German for foot (Fuß) because our ancestors had been very fast runners. I
am curious whether there is any evidence for this claim in my family so I
have gathered running times for a distance of one mile from 6 family
members. The average healthy adult can run one mile in 10 minutes and
13 seconds (standard deviation of 76 seconds). Is my family running speed
different from the national average? Assume that running speed follows a
normal distribution.
Person     Running Time    …in seconds
Paul       13min 48sec     828sec
Phyllis    10min 10sec     610sec
Tom        7min 54sec      474sec
Aleigha    9min 22sec      562sec
Arlo       8min 38sec      518sec
David      9min 48sec      588sec
∑ = 3580
N = 6
M = 596.667
Does a Foos run faster?
Given: μ = 613sec , σ = 76sec, x̅ = 596.667sec, N = 6
1. Populations, distributions, and assumptions
– Populations:
1. All individuals with the last name Foos.
2. All healthy adults.
– Distribution: Distribution of sample means
– Test & Assumptions: We know μ and σ , so z test
2. State the null (H0) and research (H1) hypotheses
H0: People with the last name Foos do not run at different
speeds than the national average.
H1: People with the last name Foos do run at different speeds
(either slower or faster) than the national average.
3. Calculate the standard error of the mean
σM = σ/√N = 76/√6 = 31.02
Given: μ = 613sec , σM = 31.02sec, x̅ = 596.667sec, N = 6
4. Determine critical value (cutoffs)
– In Behavioral Sciences, we use p = 0.05
– Our hypothesis (“People with the last name Foos do run
at different speeds (either slower or faster) than the
national average.”) is nondirectional so our hypothesis
test is two-tailed.
THIS z Table lists the percentage under
the normal curve, between the mean
(center of distribution) and the z
statistic.
zcrit = ±1.96
5. Calculate test statistic
z = (x̅ − μ)/σM = (596.667 − 613)/31.02 = −0.53
6. Make a Decision
|z| = 0.53 < zcrit = 1.96, fail to reject null hypothesis
The data give no evidence that the average one-mile running time of
Foos family members differs from the national average running time,
so the family myth is not supported.
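The whole Foos example can be reproduced from the raw running times. The sketch below is one way to do it in Python, assuming the six times in the table and the national figures quoted in the problem; variable names are illustrative. Because it carries full precision, the standard error may differ in the last digit from the 31.02 shown on the slides.

```python
import math
from statistics import mean

# Running times in seconds, copied from the table in the example.
times_sec = {
    "Paul": 828, "Phyllis": 610, "Tom": 474,
    "Aleigha": 562, "Arlo": 518, "David": 588,
}

mu, sigma = 613, 76          # national mean and standard deviation (seconds)
z_crit = 1.96                # two-tailed cutoff at p = 0.05, from the z table

n = len(times_sec)
sample_mean = mean(times_sec.values())
standard_error = sigma / math.sqrt(n)
z = (sample_mean - mu) / standard_error

reject_null = abs(z) > z_crit
print(f"M = {sample_mean:.3f}, z = {z:.2f}, reject H0: {reject_null}")
```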
• A random sample of 100 recorded deaths in the United States
during the past year showed an average life span of 71.8
years. Assuming a population standard deviation of 8.9 years,
does this seem to indicate that the mean life span today is
greater than 70 years? Use a 0.05 level of significance.
• Population: People whose deaths were recorded in the United States during the past year
• Distribution: Distribution of sample means
• Test: Z test
• Hypothesis:
• H0: μ = 70 years.
• H1: μ > 70 years.
• Critical region: z > 1.645 (α = 0.05, one-tailed
test)
• Test statistic: z = (x̅−μ)/(σ/√n), where x̅ = 71.8 years, μ = 70 years,
σ = 8.9 years, and n = 100.
• z = (71.8−70)/(8.9/√100) = 2.02.
• Decision: Reject H0 in favour of H1 and conclude
that the mean life span today is greater than 70
years.
• The P-value corresponding to z = 2.02 is P = P(Z >
2.02) = 0.0217.
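For a one-tailed test the critical value and the P-value can both be obtained from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library and the figures from the life-span example. Because it uses the unrounded z, the P-value comes out slightly below the 0.0217 obtained from the table with z = 2.02.

```python
import math
from statistics import NormalDist

x_bar, mu0, sigma, n = 71.8, 70.0, 8.9, 100
alpha = 0.05

z = (x_bar - mu0) / (sigma / math.sqrt(n))

standard_normal = NormalDist()                # mean 0, standard deviation 1
z_crit = standard_normal.inv_cdf(1 - alpha)   # right-tail cutoff, about 1.645
p_value = 1 - standard_normal.cdf(z)          # P(Z > z)

reject_null = z > z_crit
print(f"z = {z:.2f}, z_crit = {z_crit:.3f}, P = {p_value:.4f}, reject H0: {reject_null}")
```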
DOES SAMPLE SIZE MATTER?
Increasing Sample Size
• Increasing the sample size shrinks the standard error σ/√N, so for the
same observed difference the test statistic grows, which increases the
probability of finding a significant effect
Why Increasing Sample Size Matters
• Example1: Psychology GRE scores
• Population: μ = 554, σ = 99
• Sample: x̅ = 568, N = 90
σM = σ/√N = 99/√90 = 10.436
z = (x̅ − μ)/σM = (568 − 554)/10.436 = 1.34
Why Increasing Sample Size Matters
• Example2: Psychology GRE scores for N = 200
Population: μ = 554, σ = 99
Sample: x̅ = 568, N = 200
σM = σ/√N = 99/√200 = 7.00
z = (x̅ − μ)/σM = (568 − 554)/7.00 = 2.00
Why Increasing Sample Size Matters
                  N = 90                       N = 200
Given             μ = 554, σ = 99, x̅ = 568     μ = 554, σ = 99, x̅ = 568
Standard error    σM = 99/√90 = 10.436         σM = 99/√200 = 7.00
Test statistic    z = 1.34                     z = 2.00
zcritical (p = .05) = ±1.96
Decision          Not significant,             Significant,
                  fail to reject null          reject null
                  hypothesis                   hypothesis
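The comparison can be checked by running the same calculation for both sample sizes. This is a small illustrative Python loop using the GRE figures from the two examples; the dictionary name `results` is arbitrary.

```python
import math

mu, sigma, x_bar = 554, 99, 568   # population mean, population SD, sample mean
z_crit = 1.96                     # two-tailed cutoff at p = .05

results = {}
for n in (90, 200):
    standard_error = sigma / math.sqrt(n)
    z = (x_bar - mu) / standard_error
    results[n] = (standard_error, z, abs(z) > z_crit)

for n, (se, z, significant) in results.items():
    print(f"N = {n}: standard error = {se:.3f}, z = {z:.2f}, significant: {significant}")
```

The observed difference of 14 points is identical in both runs; only the standard error changes.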
One Sample: Test on a Single
Proportion
• A builder claims that heat pumps are installed in 70% of all
homes being constructed today in the city of Richmond,
Virginia. Would you agree with this claim if a random survey
of new homes in this city showed that 8 out of 15 had heat
pumps installed? Use a 0.10 level of significance.
• H0: p = 0.7.
• H1: p ≠ 0.7.
• α = 0.10
• Test statistic: Binomial variable X with p = 0.7 and n = 15.
Computations: x = 8 and mean np = (15)(0.7) = 10.5.
• P = P(X ≤ 8 when p = 0.7) + P(X ≥ 13 when p = 0.7)
• ≈ 2P(X ≤ 8 when p = 0.7) = 2 Σ_{x=0}^{8} b(x; 15, 0.7) = 0.2622 > 0.10
• Decision: Do not reject H0; there is insufficient reason to doubt the builder's claim.
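The binomial sum can be evaluated exactly instead of being read from a table. The following sketch doubles the lower tail, as in the computation above; `binom_pmf` is a helper written for this illustration, not a library function.

```python
from math import comb


def binom_pmf(x, n, p):
    """b(x; n, p): probability of exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p0, x_obs, alpha = 15, 0.7, 8, 0.10

lower_tail = sum(binom_pmf(x, n, p0) for x in range(x_obs + 1))  # P(X <= 8)
p_value = 2 * lower_tail                                          # two-sided P-value

reject_null = p_value < alpha
print(f"P(X <= {x_obs}) = {lower_tail:.4f}, P = {p_value:.4f}, reject H0: {reject_null}")
```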
Sampling distribution for x̅1 − x̅2
The mean of the difference of the sample means is
μ_{x̅1−x̅2} = μ_{x̅1} − μ_{x̅2} = μ1 − μ2
and its standard error is
σ_{x̅1−x̅2} = √(σ²_{x̅1} + σ²_{x̅2}) = √(σ1²/n1 + σ2²/n2).
Two Sample z-Test for the Means
Example:
A high school math teacher claims that students in her class will
score higher on the math portion of the ACT than students in a
colleague’s math class. The mean ACT math score for 49 students
in her class is 22.1 and the standard deviation is 4.8. The mean ACT
math score for 44 of the colleague’s students is 19.8 and the
standard deviation is 5.4. At α = 0.10, can the teacher’s claim be
supported?
H0: μ1 = μ2
Ha: μ1 > μ2 (Claim)
α = 0.10, right-tailed test: critical value z0 = 1.28
Two Sample z-Test for the Means
Example continued:
H0: μ1 = μ2, Ha: μ1 > μ2 (Claim), z0 = 1.28
The standard error is
σ_{x̅1−x̅2} = √(σ1²/n1 + σ2²/n2) = √(4.8²/49 + 5.4²/44) ≈ 1.0644.
The standardized test statistic is
z = ((x̅1 − x̅2) − (μ1 − μ2))/σ_{x̅1−x̅2} = ((22.1 − 19.8) − 0)/1.0644 ≈ 2.161
Because z = 2.161 > z0 = 1.28, reject H0. There is enough evidence at the 10% level to
support the teacher’s claim that her students score better on the ACT.
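The two-sample calculation follows the same pattern as the one-sample case, with the standard error built from both samples. This is a minimal Python sketch using the ACT figures from the example; as on the slides, the sample standard deviations stand in for σ1 and σ2, and the critical value is computed with `statistics.NormalDist` rather than read from a table.

```python
import math
from statistics import NormalDist

x1, s1, n1 = 22.1, 4.8, 49    # teacher's class: mean, SD, sample size
x2, s2, n2 = 19.8, 5.4, 44    # colleague's class
alpha = 0.10

standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
z = ((x1 - x2) - 0) / standard_error   # H0 implies mu1 - mu2 = 0

z0 = NormalDist().inv_cdf(1 - alpha)   # right-tailed critical value
reject_null = z > z0
print(f"standard error = {standard_error:.4f}, z = {z:.3f}, z0 = {z0:.2f}, reject H0: {reject_null}")
```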
Two Samples: Tests on Two
Proportions
• If p1 and p2 are the proportions of successes in two populations and we
draw independent random samples of sizes n1 and n2 from them, with n1 and n2
sufficiently large, then the difference of the sample proportions, P̂1 − P̂2,
will be approximately normally distributed with mean and variance
• μ = p1 − p2 and σ² = p1q1/n1 + p2q2/n2, where q1 = 1 − p1 and q2 = 1 − p2
• Therefore, our critical region(s) can be established by using
the standard normal variable
• Z = ((P̂1 − P̂2) − (p1 − p2))/√(p1q1/n1 + p2q2/n2)
Sample standard deviation: s = √(Σ(Xi − X̅)²/(N − 1))
What is a T-distribution?
• A t-distribution is like a Z distribution, except that it has slightly fatter tails
to reflect the uncertainty added by estimating σ.
• The bigger the sample size (i.e., the bigger the sample size used to
estimate σ), the closer t becomes to Z.
• For n ≥ 30, t is close to Z.
Note: t → Z as n increases
[Figure: standard normal curve (t with df = ∞) compared with t (df = 13) and t (df = 5).
t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal.]
Single-Sample t Test: Attendance in
Therapy Sessions
• Our Counseling center on campus is concerned that most students
requiring therapy do not take advantage of their services. Right
now students attend only 4.6 sessions in a given year!
Administrators are considering having patients sign a contract
stating they will attend at least 10 sessions in an academic year.
• Question: Does signing the contract actually increase
participation/attendance?
• We had 5 patients who signed the contract and we counted the
number of times they attended therapy sessions
Number of Attended Therapy Sessions
6
6
12
7
8
Single-Sample t Test: Attendance in
Therapy Sessions
1. Identify
– Populations:
• Pop 1: All clients who sign contract
• Pop 2: All clients who do not sign contract
– Distribution:
• One Sample mean: Distribution of sample means of
pop2
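2. State the null (H0) and research (H1) hypotheses
H0: Clients who sign the contract will attend the same number
of sessions as those who do not sign the contract.
H1: Clients who sign the contract will attend a different number
of sessions than those who do not sign the contract.
3. Calculate the sample statistics
X̅ = 7.8, Σ_{i=1}^{n} (Xi − X̅)² = 24.8
s = √(Σ(Xi − X̅)²/(N − 1)) = √(24.8/(5 − 1)) = 2.490
sM = s/√N = 2.490/√5 = 1.114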
Single-Sample t Test: Attendance in
Therapy Sessions
μ = 4.6, sM = 1.114, X̅ = 7.8, N = 5, df = 4
4. Determine critical value (cutoffs)
– In Behavioral Sciences, we use p = .05 (5%)
– Our hypothesis (“Clients who sign the contract will
attend a different number of sessions than those
who do not sign the contract.”) is nondirectional so
our hypothesis test is two-tailed.
tcrit = ± 2.776
5. Calculate the test statistic
t = (X̅ − μ)/sM = (7.8 − 4.6)/1.114 = 2.873
6. Make a decision
t = 2.873 > tcrit = 2.776, reject the null hypothesis
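The therapy-attendance test can be recomputed directly from the five observations. The sketch below relies on `statistics.stdev`, which divides by N − 1 as the formula for s requires; the critical value 2.776 is copied from the t table (df = 4, two-tailed, p = .05) because the Python standard library has no t distribution. Working with unrounded intermediate values can shift the last digit of t relative to the slide.

```python
import math
from statistics import mean, stdev

sessions = [6, 6, 12, 7, 8]   # attended sessions for the 5 clients who signed
mu0 = 4.6                     # current average number of sessions
t_crit = 2.776                # from the t table: df = 4, two-tailed, p = .05

n = len(sessions)
sample_mean = mean(sessions)
s = stdev(sessions)           # sample SD with the N - 1 denominator
s_m = s / math.sqrt(n)        # standard error of the mean
t = (sample_mean - mu0) / s_m
df = n - 1

reject_null = abs(t) > t_crit
print(f"M = {sample_mean}, s = {s:.3f}, sM = {s_m:.3f}, t({df}) = {t:.3f}, reject H0: {reject_null}")
```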
6. Make Decision
Not sufficient evidence to reject the null hypothesis. We
cannot sue the light bulb manufacturer for false
advertising!
• The Edison Electric Institute has published figures on the
number of kilowatt hours used annually by various home
appliances. It is claimed that a vacuum cleaner uses an
average of 46 kilowatt hours per year. If a random sample of
12 homes included in a planned study indicates that vacuum
cleaners use an average of 42 kilowatt hours per year with a
standard deviation of 11.9 kilowatt hours, does this suggest at
the 0.05 level of significance that vacuum cleaners use, on an
average, less than 46 kilowatt hours annually? Assume the
population of kilowatt hours to be normal.
• H0: μ = 46 kilowatt hours.
• H1: μ < 46 kilowatt hours (one tailed, left tailed)
• α = 0.05.
• Critical region: t < −1.796, with 11 degrees of
freedom (from table).
• Computations: t = (x̅−μ)/(s/√n), with x̅ = 42, μ = 46, s = 11.9,
and n = 12.
• Hence, t = (42 − 46)/(11.9/√12) = −1.16.
• Decision: Do not reject H0 and conclude that the
average number of kilowatt hours used annually by
home vacuum cleaners is not significantly less than 46.
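For a left-tailed test the statistic is compared against a negative cutoff. This is a brief Python sketch with the vacuum-cleaner figures; the critical value is taken from the t table as in the example, not computed.

```python
import math

x_bar, mu0, s, n = 42, 46, 11.9, 12
t_crit = -1.796                  # from the t table: df = 11, one-tailed, alpha = 0.05

t = (x_bar - mu0) / (s / math.sqrt(n))
df = n - 1

reject_null = t < t_crit         # left-tailed: reject only for sufficiently negative t
print(f"t({df}) = {t:.2f}, reject H0: {reject_null}")
```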
Testing the Difference Between Means
(Small Independent Samples)
Two Sample t-Test
If samples of size less than 30 are taken from normally-distributed populations, a t-test
may be used to test the difference between the population means μ1 and μ2.
Three conditions are necessary to use a t-test for small independent samples.
The standardized test statistic is
t = ((x̅1 − x̅2) − (μ1 − μ2))/σ_{x̅1−x̅2}.
If the population variances are equal, then information from the two
samples is combined to calculate a pooled estimate of the standard
deviation, σ̂, as follows:
σ̂ = √(((n1 − 1)s1² + (n2 − 1)s2²)/(n1 + n2 − 2))
Two-Sample t-Test (Continued)
The standard error for the sampling distribution of x̅1 − x̅2 is
σ_{x̅1−x̅2} = σ̂ · √(1/n1 + 1/n2) (variances equal)
and d.f. = n1 + n2 – 2.
If the population variances are not equal, then the standard error is
σ_{x̅1−x̅2} = √(s1²/n1 + s2²/n2) (variances not equal)
Two Sample t-Test for the Means
Using a Two-Sample t-Test for the Difference Between Means (Small Independent
Samples)
d.f. = n1 + n2 – 2 = 17 + 18 – 2 = 33
Critical values: –t0 = –2.576 and t0 = 2.576
Two Sample t-Test for the Means
Example continued:
H0: μ1 = μ2
Ha: μ1 ≠ μ2 (Claim)
Critical values: –t0 = –2.576, t0 = 2.576
The standard error is
σ_{x̅1−x̅2} = σ̂ · √(1/n1 + 1/n2)
= √(((n1 − 1)s1² + (n2 − 1)s2²)/(n1 + n2 − 2)) · √(1/n1 + 1/n2)
= √(((17 − 1)7800² + (18 − 1)7375²)/(17 + 18 − 2)) · √(1/17 + 1/18)
≈ 7584.0355(0.3382)
≈ 2564.92
Two Sample t-Test for the Means
Example continued:
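H0: μ1 = μ2
Ha: μ1 ≠ μ2 (Claim)
Critical values: –t0 = –2.576, t0 = 2.576
t = ((x̅1 − x̅2) − (μ1 − μ2))/σ_{x̅1−x̅2} = ((35800 − 35100) − 0)/2564.92 ≈ 0.273
Because –2.576 < 0.273 < 2.576, t does not fall in the rejection region.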
Fail to reject H0.
There is not enough evidence at the 1% level to support the claim that the mean annual
incomes differ.
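The pooled calculation is easy to get wrong by hand, so a short check is useful. The Python sketch below assumes equal population variances and uses the sample figures from the worked example; the ±2.576 cutoff is the one printed on the slides. A fuller t table gives a somewhat larger cutoff for 33 degrees of freedom (roughly 2.73), which does not change the decision here.

```python
import math

n1, x1, s1 = 17, 35800, 7800   # first sample: size, mean, standard deviation
n2, x2, s2 = 18, 35100, 7375   # second sample
t0 = 2.576                     # two-tailed cutoff used on the slides (alpha = 0.01)

# Pooled estimate of the common standard deviation.
pooled_sd = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
standard_error = pooled_sd * math.sqrt(1 / n1 + 1 / n2)

t = ((x1 - x2) - 0) / standard_error   # H0 implies mu1 - mu2 = 0
df = n1 + n2 - 2

reject_null = abs(t) > t0
print(f"pooled SD = {pooled_sd:.4f}, SE = {standard_error:.2f}, t({df}) = {t:.3f}, reject H0: {reject_null}")
```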
Normal or t-Distribution?
Are both sample sizes at least 30?
Yes: Use the z-test.
No: If both populations are normally distributed, a t-test may be used.
Example:
A researcher claims that the distribution of favorite pizza toppings
among teenagers is as shown below. Each outcome is classified into
categories, and the probability for each possible outcome is fixed.
Topping       Frequency, f
Cheese        41%
Pepperoni     25%
Sausage       15%
Mushrooms     10%
Onions        9%
Chi-Square Goodness-of-Fit Test
A Chi-Square Goodness-of-Fit Test is used to test whether an observed
frequency distribution fits an expected distribution.
To calculate the test statistic for the chi-square goodness-of-fit test, the
observed frequencies and the expected frequencies are used.
Performing a Chi-Square Goodness-of-Fit Test
6. Calculate the test statistic: χ² = Σ_{i=1}^{k} (Oi − Ei)²/Ei
Topping       Observed Frequency, f
Cheese        39%
Pepperoni     26%
Sausage       15%
Mushrooms     12.5%
Onions        7.5%
Using α = 0.01, and the observed and expected values previously calculated, test the
surveyor’s claim using a chi-square goodness-of-fit test.
Chi-Square Goodness-of-Fit Test
Example continued:
Topping       Observed Frequency    Expected Frequency
Cheese        78                    82
Pepperoni     52                    50
Sausage       30                    30
Mushrooms     25                    20
Onions        15                    18
Rejection region: α = 0.01, critical value χ²0 = 13.277
χ² = Σ_{i=1}^{k} (Oi − Ei)²/Ei
= (78 − 82)²/82 + (52 − 50)²/50 + (30 − 30)²/30 + (25 − 20)²/20 + (15 − 18)²/18
≈ 2.025
Fail to reject H0.
There is not enough evidence at the 1% level to reject the surveyor’s claim.
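The expected frequencies come from multiplying each claimed percentage by the sample size, after which the statistic is a single sum. Below is a minimal Python sketch for the pizza-topping example; the function name `chi_square_gof` is hypothetical, and the critical value 13.277 (df = 4, α = 0.01) is copied from the example.

```python
def chi_square_gof(observed, expected_share):
    """Return (chi2, expected_counts) for a goodness-of-fit test.

    observed: dict of category -> observed count
    expected_share: dict of category -> claimed proportion (summing to 1)
    """
    n = sum(observed.values())
    expected = {cat: n * share for cat, share in expected_share.items()}
    chi2 = sum((observed[cat] - expected[cat]) ** 2 / expected[cat] for cat in observed)
    return chi2, expected


claimed = {"Cheese": 0.41, "Pepperoni": 0.25, "Sausage": 0.15, "Mushrooms": 0.10, "Onions": 0.09}
observed = {"Cheese": 78, "Pepperoni": 52, "Sausage": 30, "Mushrooms": 25, "Onions": 15}

chi2, expected = chi_square_gof(observed, claimed)
chi2_crit = 13.277               # from the chi-square table: df = 4, alpha = 0.01
df = len(observed) - 1

reject_null = chi2 > chi2_crit
print(f"chi-square({df}) = {chi2:.3f}, reject H0: {reject_null}")
```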
Another Example
• In a study of vehicle ownership, it has been found that 13.5% of
U.S. households do not own a vehicle, with 33.7% owning 1
vehicle, 33.5% owning 2 vehicles, and 19.3% owning 3 or more
vehicles. The data for a random sample of 100 households in a
resort community are summarized below. At the 0.05 level of
significance, can we conclude that the vehicle-ownership
distribution in this community differs from that of the
nation as a whole?
# Vehicles Owned # Households
0 20
1 35
2 23
3 or more 22
Goodness-of-Fit: An Example
I. Hypotheses:
H0: Observed distribution in this community is the same as it is
in the nation as a whole.
H1: Vehicle-ownership distribution in this community is not the
same as it is in the nation as a whole.
# Vehicles    Oj    Ej      [Oj − Ej]²/Ej
0             20    13.5    3.1296
1             35    33.7    0.0501
2             23    33.5    3.2910
3+            22    19.3    0.3777
                            Sum = 6.8484
Goodness-of-Fit: An Example
II. Rejection Region:
α = 0.05
df = k – 1 = 4 – 1 = 3
Reject H0 if χ² > 7.815 (0.95 of the distribution lies below this value and 0.05 above it)
III. Test Statistic:
χ² = 6.8484
IV. Conclusion: Since the test statistic of χ² = 6.8484 falls below the
critical value of χ² = 7.815, we do not reject H0 at the 0.05 level of
significance.
V. Implications: There is not enough evidence to show that vehicle
ownership in this community differs from that in the nation as a
whole.
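Listing the contribution of each category shows where the statistic comes from; here almost all of it is due to the 0-vehicle and 2-vehicle households. The Python sketch below is an illustration using the national percentages and the sample counts from the example, with the critical value 7.815 copied from the chi-square table. Summing unrounded contributions can differ from 6.8484 in the fourth decimal place.

```python
labels = ["0", "1", "2", "3+"]
national_share = [0.135, 0.337, 0.335, 0.193]   # national proportions from the example
observed = [20, 35, 23, 22]                     # households in the resort community

n = sum(observed)
expected = [n * share for share in national_share]
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

chi2 = sum(contributions)
df = len(observed) - 1
chi2_crit = 7.815                               # df = 3, alpha = 0.05

for label, o, e, c in zip(labels, observed, expected, contributions):
    print(f"{label:>2} vehicles: O = {o}, E = {e:.1f}, contribution = {c:.4f}")
print(f"chi-square({df}) = {chi2:.4f}, reject H0: {chi2 > chi2_crit}")
```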
Independence Using the Chi-Square Test
Chi-Square Independence Test
A chi-square independence test is used to test the independence
of two variables. Using a chi-square test, you can determine
whether the occurrence of one variable affects the probability of
the occurrence of the other variable.
Contingency Tables
An r × c contingency table shows the observed frequencies for two
variables. The observed frequencies are arranged in r rows and c
columns. The intersection of a row and a column is called a cell.
          Age
Gender    16 – 20    21 – 30    31 – 40    41 – 50    51 – 60    61 and older
Male      32         51         52         43         28         10
Female    13         22         33         21         10         6
An Integrated Definition of
Independence
• From basic probability:
If two events are independent
P(A and B) = P(A) • P(B)
Expected Frequency
Example continued:
          Age
Gender    16 – 20    21 – 30    31 – 40    41 – 50    51 – 60    61 and older    Total
Male      32         51         52         43         28         10              216
Female    13         22         33         21         10         6               105
Total     45         73         85         64         38         16              321
Expected frequency: E_{r,c} = (Sum of row r) × (Sum of column c)/(Sample size)
Chi-Square Independence Test
Performing a Chi-Square Independence Test
Observed frequencies, with expected frequencies in parentheses:
          Age
Gender    16 – 20       21 – 30       31 – 40       41 – 50       51 – 60       61 and older    Total
Male      32 (30.28)    51 (49.12)    52 (57.20)    43 (43.07)    28 (25.57)    10 (10.77)      216
Female    13 (14.72)    22 (23.88)    33 (27.80)    21 (20.93)    10 (12.43)    6 (5.23)        105
Total     45            73            85            64            38            16              321
Chi-Square Independence Test
Example continued:
Because each expected frequency is at least 5 and the drivers were randomly selected,
the chi-square independence test can be used to test whether the variables are
independent.
Rejection region: α = 0.05, critical value χ²0 = 11.071
O     E        O – E     (O – E)²    (O – E)²/E
32    30.28    1.72      2.9584      0.0977
51    49.12    1.88      3.5344      0.072
52    57.20    –5.2      27.04       0.4727
43    43.07    –0.07     0.0049      0.0001
28    25.57    2.43      5.9049      0.2309
10    10.77    –0.77     0.5929      0.0551
13    14.72    –1.72     2.9584      0.201
22    23.88    –1.88     3.5344      0.148
33    27.80    5.2       27.04       0.9727
21    20.93    0.07      0.0049      0.0002
10    12.43    –2.43     5.9049      0.4751
6     5.23     0.77      0.5929      0.1134
χ² = Σ (O – E)²/E ≈ 2.84
Fail to reject H0.
There is not enough evidence at the 5% level to conclude that age is dependent on
gender in such accidents.
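The expected frequencies and the test statistic can be generated from the observed table alone. The following Python sketch applies E = (row total × column total)/sample size to the age-by-gender table; the critical value 11.071 is copied from the example. Because the expected frequencies are not rounded to two decimals here, the statistic may differ from 2.84 in the second decimal place.

```python
observed = {
    "Male":   [32, 51, 52, 43, 28, 10],
    "Female": [13, 22, 33, 21, 10, 6],
}

row_totals = {gender: sum(counts) for gender, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
n = sum(row_totals.values())

# Expected frequency for each cell: (row total * column total) / sample size.
expected = {
    gender: [row_totals[gender] * col / n for col in col_totals]
    for gender in observed
}

chi2 = sum(
    (o - e) ** 2 / e
    for gender in observed
    for o, e in zip(observed[gender], expected[gender])
)
df = (len(observed) - 1) * (len(col_totals) - 1)   # (r - 1)(c - 1)
chi2_crit = 11.071                                 # df = 5, alpha = 0.05

reject_null = chi2 > chi2_crit
print(f"chi-square({df}) = {chi2:.2f}, reject H0: {reject_null}")
```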