10 Inferential Statistics

Statistical inference involves using data from a sample to draw conclusions about a population. A statistical hypothesis is a conjecture about a population that is tested using sample data. The null hypothesis states that there is no effect or no difference, while the alternative hypothesis states there is an effect or difference. Researchers aim to reject the null hypothesis if evidence from the sample contradicts it. The p-value approach provides the probability of obtaining test results at least as extreme as the observed results, assuming the null hypothesis is true. This allows researchers to evaluate how likely it is that the null hypothesis should be rejected.


Inferential Statistics

Statistical Inference
• A statistical hypothesis is a conjecture (an opinion or conclusion formed on the basis of incomplete information) concerning one or more populations.
• The truth or falsity of a statistical hypothesis is never known with absolute certainty unless we examine the entire population.
• Instead, we take a random sample from the population of interest and use the data contained in this sample to provide evidence that either supports or does not support the hypothesis.
• Evidence from the sample that is inconsistent with the stated hypothesis leads to rejection of the hypothesis.
The Role of Probability in Hypothesis Testing

• Suppose that the hypothesis postulated by an engineer is that the fraction defective p in a certain process is 0.10.
• Suppose that 100 items are tested and 12 items are found defective.
• It is reasonable to conclude that this evidence does not refute the condition that the binomial parameter p = 0.10.
• However, it also does not refute the possibility that the true value is p = 0.12 or even higher.
The Role of Probability in Hypothesis Testing

• But if we were to find 20 items defective, we could refute the hypothesis with high confidence.
• A firm conclusion is established by the data analyst when a hypothesis is rejected.
• If a scientist is interested in strongly supporting a contention, he or she hopes to arrive at the contention in the form of rejection of a hypothesis.
• For example, if a medical researcher wishes to show strong evidence in favor of the contention that coffee drinking increases the risk of cancer, the hypothesis tested should be of the form "there is no increase in cancer risk produced by drinking coffee."
The Null and Alternative Hypotheses
• Null hypothesis is a general statement or default position
(status quo) and it is generally assumed to be true until
evidence indicates otherwise. It is denoted by H0
• Alternative hypothesis is a position that states something is
happening, a new theory is preferred instead of an old one. It
is denoted by H1/Ha
• The null hypothesis H0 nullifies or opposes H1 and is often the
logical complement to H1
• conclusions:
– reject H0 in favor of H1 because of sufficient evidence in the data
or
– fail to reject H0 because of insufficient evidence in the data.
Example
• H0: defendant is innocent,
• H1: defendant is guilty.
• The indictment comes because of suspicion of
guilt. The hypothesis H0 (the status quo) stands in
opposition to H1 and is maintained unless H1 is
supported by evidence “beyond a reasonable
doubt.”
• However, “fail to reject H0” in this case does not
imply innocence, but merely that the evidence
was insufficient to convict. So the jury does not
necessarily accept H0 but fails to reject H0.
Testing a Statistical Hypothesis
• A certain type of cold vaccine (A) is known to be only 25% effective after a period of 2 years.
• Another vaccine (B) is to be tested to determine whether it is better than A.
• Suppose that 20 people are chosen at random and inoculated with B.
• If more than 8 of those receiving B surpass the 2-year period without contracting the virus, then B will be considered superior to A.
• The requirement that the number exceed 8 is somewhat arbitrary but appears reasonable in that it represents a modest gain over the 5 people who would be expected to be protected had all 20 received vaccine A (25% of 20).
• We are essentially testing the null hypothesis that B is at most as effective after a period of 2 years as A; the alternative hypothesis is that B is in fact superior.
• H0: p ≤ 0.25
• H1: p > 0.25
The Test Statistic
• The test statistic on which we base our decision is X, the
number of individuals in our test group who receive
protection from the new vaccine for a period of at least 2
years. The possible values of X, from 0 to 20, are divided into
two groups: those numbers less than or equal to 8 and those
greater than 8.
• All possible scores greater than 8 constitute the critical
region.
• if x > 8, we reject H0 in favor of the alternative hypothesis H1.
If x ≤ 8, we fail to reject H0.
Types of Error
• Rejection of the null hypothesis when it is true
is called a type I error.
• Non-rejection of the null hypothesis when it is
false is called a type II error.
Probability of committing a type I error
• The probability of committing a type I error, also called the
level of significance (also called size of the test), is denoted
by the Greek letter α.
• As per the last example, a type I error will occur when more
than 8 individuals inoculated with B surpass the 2-year period
without contracting the virus and researchers conclude that B
is better when it is actually equivalent to A.
• α = P(type I error) = P(X > 8 when p = 1/4) = Σ_{x=9}^{20} b(x; 20, 1/4) = 0.0409
• We say that the null hypothesis, p = 1/4, is being tested at the
α = 0.0409 level of significance.
• Therefore, the chance that a type I error will be committed is very low.
The Probability of a Type II Error
• The probability of committing a type II error, denoted by β, is
impossible to compute unless we have a specific alternative
hypothesis. If we test the null hypothesis that p = 1/4 against
the alternative hypothesis that p = 1/2, then we are able to
compute the probability of not rejecting H0 when it is false.
• β = P(type II error) = P(X ≤ 8 when p = 1/2) = Σ_{x=0}^{8} b(x; 20, 1/2) = 0.2517
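Both tail probabilities above can be checked directly from the binomial pmf; a minimal sketch using only the standard library (the function and variable names are mine):

```python
from math import comb

def binom_pmf(x, n, p):
    # Binomial probability b(x; n, p)
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Type I error: observe X > 8 when H0 (p = 1/4) is actually true
alpha = sum(binom_pmf(x, 20, 0.25) for x in range(9, 21))

# Type II error: observe X <= 8 when the true proportion is p = 1/2
beta = sum(binom_pmf(x, 20, 0.5) for x in range(9))

print(round(alpha, 4))  # 0.0409
print(round(beta, 4))   # 0.2517
```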
• A real estate agent claims that 60% of all private residences being
built today are 3-bedroom homes. To test this claim, a large sample
of new residences is inspected; the proportion of these homes with
3 bedrooms is recorded and used as the test statistic. State the null
and alternative hypotheses to be used in this test.
• If the test statistic were substantially higher or lower than p = 0.6,
we would reject the agent’s claim. Hence, we should make the
hypothesis
H0: p = 0.6,
H1: p != 0.6.
• The alternative hypothesis implies a two-tailed test with the critical
region divided equally in both tails of the distribution of P, our test
statistic.
The Use of P-Values for Decision Making in
Testing Hypotheses
• In testing hypotheses in which the test statistic is discrete, the
critical region may be chosen arbitrarily and its size determined. If α
is too large, it can be reduced by making an adjustment in the
critical value.
• Over many generations of statistical analysis, it became customary to choose α = 0.05 or 0.01 and select the critical region accordingly.
• If the test is two-tailed with α set at the 0.05 level of significance, and the test statistic involves, say, the standard normal distribution, then a z-value is observed from the data and the critical region is z > 1.96 or z < −1.96.
• A value of z in the critical region prompts the statement “The value
of the test statistic is significant”
Two-tailed versus One-Tailed
Pre-selection of a Significance Level
• This pre-selection of a significance level α has its roots in the philosophy that the maximum risk of making a type I error should be controlled.
• However, this approach does not account for values of test statistics that are "close" to the critical region.
• Suppose, for example, we test H0: μ = 10 versus H1: μ != 10 and a value of z = 1.87 is observed; strictly speaking, with α = 0.05, the value is not significant. But the risk of committing a type I error if one rejects H0 in this case could hardly be considered severe. In fact, in a two-tailed scenario, one can quantify this risk as
P = 2P(Z > 1.87 when μ = 10) = 2(0.0307) = 0.0614.
• The P-value approach has been adopted extensively by users
of applied statistics. The approach is designed to give the
user an alternative (in terms of a probability) to a mere
“reject” or “do not reject” conclusion.
Testing Hypotheses
1. Identify the population, distribution, inferential test
2. State the null and alternative hypotheses
3. Determine characteristics of the distribution
4. Determine critical values or cutoffs
5. Calculate test statistic (e.g., z statistic)
6. Make a decision
Single Sample: Tests Concerning a
Single Mean (variance known)
Z-test
• A Z-test is any statistical test for which the distribution of the test statistic can be approximated by a normal distribution.
• The Z-test tests the mean of a distribution in which we already know the population variance σ².
• For each significance level, the Z-test has a single critical value (for example, 1.96 for a 5% two-tailed test), which makes it convenient.
Critical region for alternative
hypothesis
The z Test: An Example
Given: μ = 156.5, σ = 14.6, x̅ = 156.11, N = 97
1. Populations, distributions, and test
– Populations: All students at UMD who have taken
the test (not just our sample)
– Distribution: Sample  distribution of means
– Test : z test
The z Test: An Example
2. State the null (H0) and alternative (H1) hypotheses
In symbols: H0: μ1 = μ2
            H1: μ1 ≠ μ2
In words:   H0: The mean of population 1 will be equal to the mean of population 2.
            H1: The mean of population 1 will be different from the mean of population 2.
The z Test: An Example
3. Determine characteristics of distribution.
– Population: μ = 156.5, σ = 14.6
– Sample: x̅ = 156.11, n = 97

σM = σ/√n = 14.6/√97 = 1.482
The z Test: An Example
4. Determine critical value (cutoffs)
– In the behavioral sciences, we use p = 0.05
– p = 0.05 = 5% → 2.5% in each tail
– 50% − 2.5% = 47.5%
– Consult the z table for 47.5% → z = 1.96
The z Test: An Example
5. Calculate test statistic

z = (x̅ − μ)/σM = (156.11 − 156.5)/1.482 = −0.26
6. Make a decision: |z| = 0.26 < zcrit = 1.96, so we fail to reject H0.
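Steps 3 through 5 of this example can be reproduced in a few lines (values taken from the slide; the variable names are mine):

```python
from math import sqrt

mu, sigma, xbar, n = 156.5, 14.6, 156.11, 97

sem = sigma / sqrt(n)   # standard error of the mean, sigma_M
z = (xbar - mu) / sem   # z statistic for the sample mean

print(round(sem, 3))  # 1.482
print(round(z, 2))    # -0.26
```

Since |z| is far below 1.96, the data give no reason to reject the null hypothesis.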
Does a Foos run faster?
• When I was growing up my father told me that our last name, Foos, was
German for foot (Fuß) because our ancestors had been very fast runners. I
am curious whether there is any evidence for this claim in my family so I
have gathered running times for a distance of one mile from 6 family
members. The average healthy adult can run one mile in 10 minutes and
13 seconds (standard deviation of 76 seconds). Is my family running speed
different from the national average? Assume that running speed follows a
normal distribution.
Person    Running Time     …in seconds
Paul      13 min 48 sec    828
Phyllis   10 min 10 sec    610
Tom        7 min 54 sec    474
Aleigha    9 min 22 sec    562
Arlo       8 min 38 sec    518
David      9 min 48 sec    588

Σ = 3580, N = 6, M = 596.667
Does a Foos run faster?
Given: μ = 613sec , σ = 76sec, x̅ = 596.667sec, N = 6
1. Populations, distributions, and assumptions
– Populations:
1. All individuals with the last name Foos.
2. All healthy adults.
– Distribution: Sample mean  distribution of
means
– Test & Assumptions: We know μ and σ , so z test
Does a Foos run faster?
Given: μ = 613sec , σ = 76sec, x̅ = 596.667sec, N = 6
2. State the null (H0) and research (H1) hypotheses

H0: People with the last name Foos do not run at different speeds than the national average.
H1: People with the last name Foos do run at different speeds (either slower or faster) than the national average.
Does a Foos run faster?
Given: μ = 613sec , σ = 76sec, x̅ = 596.667sec, N = 6
3. Determine characteristics of the comparison distribution (distribution of sample means).
– Population: μ = 613 sec, σ = 76 sec
– Sample: x̅ = 596.667 sec, N = 6

σM = σ/√N = 76/√6 = 31.02
Does a Foos run faster?
Given: μ = 613sec , σM = 31.02sec, x̅ = 596.667sec, N = 6
4. Determine critical value (cutoffs)
– In Behavioral Sciences, we use p = 0.05
– Our hypothesis (“People with the last name Foos do run
at different speeds (either slower or faster) than the
national average.”) is nondirectional so our hypothesis
test is two-tailed.
This z table lists the percentage under the normal curve between the mean (center of the distribution) and the z statistic.

5% (p = .05) / 2 = 2.5% in each tail
100% − 2.5% = 97.5%, and 97.5% = 50% + 47.5%

zcrit = ±1.96
Does a Foos run faster?
Given: μ = 613 sec, σM = 31.02 sec, x̅ = 596.667 sec, N = 6
5. Calculate test statistic

z = (x̅ − μ)/σM = (596.667 − 613)/31.02 = −0.53

6. Make a Decision
Does a Foos run faster?
Given: μ = 613sec , σM = 31.02sec, x̅ = 596.667sec, N = 6

6. Make a Decision
|z| = 0.53 < zcrit = 1.96, fail to reject the null hypothesis
The average one-mile running time of Foos family members is not significantly different from the national average running time; the data do not support the myth.
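The whole Foos example can be verified from the raw times (a sketch; variable names are my own):

```python
from math import sqrt

times = [828, 610, 474, 562, 518, 588]  # one-mile times in seconds
mu, sigma = 613, 76                     # national average and SD (seconds)

xbar = sum(times) / len(times)
sem = sigma / sqrt(len(times))          # sigma_M
z = (xbar - mu) / sem

print(round(xbar, 3))  # 596.667
print(round(z, 2))     # -0.53
```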
• A random sample of 100 recorded deaths in the United States
during the past year showed an average life span of 71.8
years. Assuming a population standard deviation of 8.9 years,
does this seem to indicate that the mean life span today is
greater than 70 years? Use a 0.05 level of significance.
• Population: citizens of the USA who died during the past year
• Distribution: distribution of sample means
• Test: Z test
• Hypothesis:
• H0: μ = 70 years.
• H1: μ > 70 years.
• Critical region: z > 1.645, (α = 0.05, one tailed
test)
• Test statistic: x̅ = 71.8 years, μ = 70, σ = 8.9 years, n = 100, where
z = (x̅ − μ)/(σ/√n) = (71.8 − 70)/(8.9/√100) = 2.02.
• Decision: Reject H0 in favour of H1 and conclude
that the mean life span today is greater than 70
years.
• The P-value corresponding to z = 2.02 is P = P(Z >
2.02) = 0.0217.
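The z statistic and its one-tailed P-value can be computed with the standard normal CDF via math.erf; a minimal sketch (variable names are mine):

```python
from math import sqrt, erf

mu0, sigma, xbar, n = 70, 8.9, 71.8, 100

z = (xbar - mu0) / (sigma / sqrt(n))

# One-tailed P-value: P(Z > z) under the standard normal
p_value = 0.5 * (1 - erf(z / sqrt(2)))

print(round(z, 2))  # 2.02
print(round(p_value, 4))
```

The P-value comes out near the slide's 0.0217 (small differences arise from rounding z to two decimals).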
DOES SAMPLE SIZE MATTER?
Increasing Sample Size
• By increasing the sample size, one can increase the value of the test statistic (for a fixed observed difference), thus increasing the probability of finding a significant effect.
Why Increasing Sample Size Matters
• Example 1: Psychology GRE scores
• Population: μ = 554, σ = 99
• Sample: x̅ = 568, N = 90

σM = σ/√N = 99/√90 = 10.436
z = (x̅ − μ)/σM = (568 − 554)/10.436 = 1.34
Why Increasing Sample Size Matters
• Example 2: Psychology GRE scores for N = 200
• Population: μ = 554, σ = 99
• Sample: x̅ = 568, N = 200

σM = σ/√N = 99/√200 = 7.00
z = (x̅ − μ)/σM = (568 − 554)/7.00 = 2.00
Why Increasing Sample Size Matters
N = 90:  μ = 554, σ = 99, x̅ = 568 → σM = 99/√90 = 10.436,  z = 1.34
N = 200: μ = 554, σ = 99, x̅ = 568 → σM = 99/√200 = 7.00,  z = 2.00

zcritical (p = .05) = ±1.96

With N = 90 the result is not significant (fail to reject the null hypothesis); with N = 200 it is significant (reject the null hypothesis).
One Sample: Test on a Single
Proportion
• A builder claims that heat pumps are installed in 70% of all
homes being constructed today in the city of Richmond,
Virginia. Would you agree with this claim if a random survey
of new homes in this city showed that 8 out of 15 had heat
pumps installed? Use a 0.10 level of significance.
• H0: p = 0.7.
• H1: p != 0.7.
• α = 0.10
• Test statistic: Binomial variable X with p = 0.7 and n = 15.
Computations: x = 8 and mean(np) = (15)(0.7) = 10.5.
• P = 2P(X ≤ 8 when p = 0.7) = 2 Σ_{x=0}^{8} b(x; 15, 0.7) = 0.2622 > 0.10

• Decision: Do not reject H0. Conclude that there is insufficient evidence to doubt the builder's claim.
• Could we use the normal distribution to approximate the earlier example?
• n = 15, p = 0.7, q = 0.3, so np = 10.5 and nq = 4.5.
• Since nq < 5, the normal approximation is not appropriate here.
• A commonly prescribed drug for relieving nervous tension is
believed to be only 60% effective. Experimental results with a
new drug administered to a random sample of 100 adults who
were suffering from nervous tension show that 70 received
relief. Is this sufficient evidence to conclude that the new drug
is superior to the one commonly prescribed? Use a 0.05 level
of significance.
• H0: p = 0.6.
• H1: p > 0.6.
• α = 0.05.
• Critical region: z > 1.645 (one tailed test) [np and nq >5]
• Computations: x = 70, n = 100, and
z = (x − np)/√(npq) = (70 − 60)/√((100)(0.6)(0.4)) = 10/√24 = 2.04.
• Decision: Reject H0 and conclude that the new drug is superior.
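The normal-approximation z statistic above can be sketched as follows (variable names are mine):

```python
from math import sqrt

n, x, p0 = 100, 70, 0.6
q0 = 1 - p0

# Normal approximation to the binomial (valid here since n*p0 and n*q0 both exceed 5)
z = (x - n * p0) / sqrt(n * p0 * q0)

print(round(z, 2))  # 2.04
```

Because 2.04 exceeds the one-tailed cutoff 1.645, H0 is rejected.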
Testing the Difference Between Means
(Large Independent Samples)
Two Sample z-Test
If these requirements are met, the sampling distribution for x̄1 − x̄2 (the difference of the sample means) is a normal distribution with mean

μ_(x̄1−x̄2) = μ1 − μ2

and standard error

σ_(x̄1−x̄2) = √(σ1²/n1 + σ2²/n2).
Two Sample z-Test for the Means
Example:
A high school math teacher claims that students in her class will
score higher on the math portion of the ACT than students in a
colleague’s math class. The mean ACT math score for 49 students
in her class is 22.1 and the standard deviation is 4.8. The mean ACT
math score for 44 of the colleague’s students is 19.8 and the
standard deviation is 5.4. At  = 0.10, can the teacher’s claim be
supported?
H0: 1 = 2
 = 0.10
Ha: 1 > 2 (Claim)

-3 -2 -1 0 1 2 3
z
z0 = 1.28
Continued.
Two Sample z-Test for the Means
Example continued:

H0: μ1 = μ2
Ha: μ1 > μ2 (claim)
z0 = 1.28

The standard error is

σ_(x̄1−x̄2) = √(σ1²/n1 + σ2²/n2) = √(4.8²/49 + 5.4²/44) = 1.0644.

The standardized test statistic is

z = ((x̄1 − x̄2) − (μ1 − μ2)) / σ_(x̄1−x̄2) = ((22.1 − 19.8) − 0)/1.0644 ≈ 2.161.

Since 2.161 > 1.28, reject H0. There is enough evidence at the 10% level to support the teacher's claim that her students score better on the ACT.
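The two-sample z computation can be sketched directly from the summary statistics (variable names are mine):

```python
from math import sqrt

x1, s1, n1 = 22.1, 4.8, 49  # teacher's class
x2, s2, n2 = 19.8, 5.4, 44  # colleague's class

se = sqrt(s1**2 / n1 + s2**2 / n2)  # standard error of x1 - x2
z = (x1 - x2) / se

print(round(se, 4))  # 1.0644
print(round(z, 3))   # 2.161
```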
Two Samples: Tests on Two
Proportions
• If p1 and p2 are the proportions of successes in two populations, and we draw independent random samples of sizes n1 and n2 that are sufficiently large, then P̂1 − P̂2 (the difference of the sample proportions) will be approximately normally distributed with mean

μ_(P̂1−P̂2) = p1 − p2

and variance

σ²_(P̂1−P̂2) = p1q1/n1 + p2q2/n2.

• Therefore, our critical region(s) can be established by using the standard normal variable

Z = ((P̂1 − P̂2) − (p1 − p2)) / √(p1q1/n1 + p2q2/n2).

• When H0 is true, we can substitute p1 = p2 = p and q1 = q2 = q (where p and q are the common values) in the preceding formula for Z to give the form

Z = (P̂1 − P̂2) / √(pq(1/n1 + 1/n2)).

• Upon pooling the data from both samples, the pooled estimate of the proportion p is

p̂ = (x1 + x2)/(n1 + n2),

where x1 and x2 are the numbers of successes in the two samples.
• A vote is to be taken among the residents of a town and the
surrounding county to determine whether a proposed
chemical plant should be constructed. The construction site is
within the town limits, and for this reason many voters in the
county believe that the proposal will pass because of the large
proportion of town voters who favor the construction. To
determine if there is a significant difference in the proportions
of town voters and county voters favoring the proposal, a poll
is taken. If 120 of 200 town voters favor the proposal and 240
of 500 county residents favor it, would you agree that the
proportion of town voters favoring the proposal is higher than
the proportion of county voters? Use an α = 0.05 level of
significance.
• Let p1 and p2 be the true proportions of voters in the town
and county, respectively, favoring the proposal.
• H0: p1 = p2.
• H1: p1 > p2.
• α = 0.05.
• Critical region: z > 1.645 (one tailed)
• Computations: p̂1 = 120/200 = 0.60, p̂2 = 240/500 = 0.48, and the pooled estimate is p̂ = (120 + 240)/(200 + 500) ≈ 0.51, so
z = (0.60 − 0.48)/√((0.51)(0.49)(1/200 + 1/500)) ≈ 2.87.
• Decision: Since 2.87 > 1.645, reject H0 and agree that the proportion of town voters favoring the proposal is higher than the proportion of county voters.
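The pooled two-proportion z statistic for the voting data can be sketched as follows (variable names are mine):

```python
from math import sqrt

x1, n1 = 120, 200  # town voters favoring the proposal
x2, n2 = 240, 500  # county voters favoring the proposal

p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)  # pooled proportion under H0: p1 = p2
q_pool = 1 - p_pool

z = (p1 - p2) / sqrt(p_pool * q_pool * (1 / n1 + 1 / n2))

print(round(z, 2))  # 2.87, which exceeds 1.645, so H0 is rejected
```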
Single Sample: Tests Concerning a
Single Mean (variance unknown)
• In the last few scenarios, it was assumed that the population standard deviation is known. This assumption may not be unreasonable in situations where the engineer is quite familiar with the system or process.
• However, in many experimental scenarios, knowledge of σ is certainly no more reasonable than knowledge of the population mean μ. Often, in fact, an estimate of σ must be supplied by the same sample information that produced the sample average x̅.
Using Samples to Estimate Population
Variability
• Acknowledge sampling error
• Smaller samples tend to show less spread than the population, so the sample standard deviation divides by N − 1:

s = √( Σ(Xi − X̅)² / (N − 1) )
What is a T-distribution?
• A t-distribution is like a Z distribution, except it has slightly fatter tails to reflect the uncertainty added by estimating σ.
• The bigger the sample size (i.e., the bigger the sample used to estimate σ), the closer t becomes to Z.
• If n ≥ 30, t approaches Z.
• Let X1, X2, . . . , Xn be independent random variables that are all normal with mean μ and standard deviation σ. Let

X̅ = (1/n) Σ_{i=1}^{n} Xi  and  S² = Σ_{i=1}^{n} (Xi − X̅)² / (n − 1).

• Then the random variable T = (X̅ − μ)/(S/√n) has a t-distribution with v = n − 1 degrees of freedom.
What happened to σM?

• We need a new measure of standard error of the mean (SEM) for a distribution of sample means, based on the sample standard deviation (as opposed to the population's):

sM = s/√N

– Wait, what happened to "N − 1"?
– We already applied it when we calculated s; don't correct again!
Degrees of Freedom
• The number of scores that are free to vary when
estimating a population parameter from a sample
– df = N – 1 (for a Single-Sample t Test)
Student’s t Distribution

Note: t → Z as n increases.

t-distributions are bell-shaped and symmetric, but have 'fatter' tails than the normal; as df grows (e.g., df = 5, df = 13, df = ∞), the t curve approaches the standard normal.
Single-Sample t Test: Attendance in
Therapy Sessions
• Our Counseling center on campus is concerned that most students
requiring therapy do not take advantage of their services. Right
now students attend only 4.6 sessions in a given year!
Administrators are considering having patients sign a contract
stating they will attend at least 10 sessions in an academic year.
• Question: Does signing the contract actually increase
participation/attendance?
• We had 5 patients who signed the contract and we counted the
number of times they attended therapy sessions
Number of Attended Therapy Sessions
6
6
12
7
8
Single-Sample t Test: Attendance in
Therapy Sessions
1. Identify
– Populations:
• Pop 1: All clients who sign contract
• Pop 2: All clients who do not sign contract

– Distribution: distribution of sample means (of population 2)
– Test & assumptions: the population mean is known but not the standard deviation → single-sample t test
Single-Sample t Test: Attendance in
Therapy Sessions
2. State the null and research hypotheses

H0: Clients who sign the contract will attend the same number of sessions as those who do not sign the contract.
H1: Clients who sign the contract will attend a different number of sessions than those who do not sign the contract.
Single-Sample t Test: Attendance in
Therapy Sessions
3. Determine characteristics of comparison distribution
(distribution of sample means of pop2)
– Population 1: μ = 4.6 times
– Sample: X̅ = 7.8 times, s = 2.490, sM = 1.114

# of Sessions (X)   Xi − X̅   (Xi − X̅)²
6                   −1.8       3.24
6                   −1.8       3.24
12                   4.2      17.64
7                   −0.8       0.64
8                    0.2       0.04

X̅ = 7.8;  Σ(Xi − X̅)² = 24.8

s = √( Σ(Xi − X̅)² / (N − 1) ) = √(24.8/(5 − 1)) = 2.490

sM = s/√N = 2.490/√5 = 1.114
Single-Sample t Test: Attendance in
Therapy Sessions
μ = 4.6, sM = 1.114, X̅ = 7.8, N = 5, df = 4
4. Determine critical value (cutoffs)
– In Behavioral Sciences, we use p = .05 (5%)
– Our hypothesis (“Clients who sign the contract will
attend a different number of sessions than those
who do not sign the contract.”) is nondirectional so
our hypothesis test is two-tailed.

df = 4
Single-Sample t Test: Attendance in
Therapy Sessions
μ = 4.6, sM = 1.114, X̅ = 7.8, N = 5, df = 4
4. Determine critical value (cutoffs)

tcrit = ± 2.776
Single-Sample t Test: Attendance in
Therapy Sessions
μ = 4.6, sM = 1.114, x̅ = 7.8, N = 5, df = 4
5. Calculate the test statistic

t = (X̅ − μ)/sM = (7.8 − 4.6)/1.114 = 2.873
Single-Sample t Test: Attendance in
Therapy Sessions
μ = 4.6, sM = 1.114, X̅ = 7.8, N = 5, df = 4

6. Make a decision
t = 2.873 > tcrit = 2.776, so reject the null hypothesis.

Clients who sign a contract attend a different number of sessions than those who do not sign a contract.
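The full single-sample t computation can be sketched from the raw attendance counts (variable names are mine):

```python
from math import sqrt

sessions = [6, 6, 12, 7, 8]
mu = 4.6                       # historical mean number of sessions
n = len(sessions)

xbar = sum(sessions) / n
s = sqrt(sum((x - xbar) ** 2 for x in sessions) / (n - 1))  # sample SD, N - 1 in denominator
sem = s / sqrt(n)                                           # standard error of the mean
t = (xbar - mu) / sem

print(round(xbar, 1))  # 7.8
print(round(s, 2))     # 2.49
print(round(t, 2))     # 2.87
```

With df = 4 the two-tailed cutoff is 2.776, so the statistic falls in the rejection region.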
Another Problem:
• A manufacturer of light bulbs claims that its light bulbs have a
mean life of 1520 hours with an unknown standard deviation.
A random sample of 40 such bulbs is selected for testing. If
the sample produces a mean value of 1505 hours and a
sample standard deviation of 86, is there sufficient evidence
to claim that the mean life is significantly less than the
manufacturer claimed?
– Assume that light bulb lifetimes are roughly normally distributed.
Answer
1. Population: all light bulbs made by the manufacturer
Distribution: distribution of sample means
Test: t test
2. State hypothesis
Null hypothesis(H0): mean life = 1520 hours
Alternative hypothesis (H1): mean life < 1520 hours (one
tailed)
3. Determine characteristics of the comparison distribution:
sX̄ = s/√n = 86/√40 ≈ 13.6, and the standardized mean follows a t distribution with 39 degrees of freedom.
4. Determine cutoff:
With α = 0.05 and a left-tailed test, and since n > 30, we use the z distribution for the cutoff:
P(Z < zcritical) = 0.05, so zcritical = −1.645.
5. Compute the test statistic:
t39 = (1505 − 1520)/13.6 ≈ −1.10

6. Make Decision
Not sufficient evidence to reject the null hypothesis. We
cannot sue the light bulb manufacturer for false
advertising!
• The Edison Electric Institute has published figures on the
number of kilowatt hours used annually by various home
appliances. It is claimed that a vacuum cleaner uses an
average of 46 kilowatt hours per year. If a random sample of
12 homes included in a planned study indicates that vacuum
cleaners use an average of 42 kilowatt hours per year with a
standard deviation of 11.9 kilowatt hours, does this suggest at
the 0.05 level of significance that vacuum cleaners use, on an
average, less than 46 kilowatt hours annually? Assume the
population of kilowatt hours to be normal.
• H0: μ = 46 kilowatt hours.
• H1: μ < 46 kilowatt hours (one tailed, left tailed)
• α = 0.05.
• Critical region: t < −1.796, with 11 degrees of freedom (from the t table).
• Computations: t = (x̅ − μ)/(s/√n), with x̅ = 42, μ = 46, s = 11.9, and n = 12.
• Hence, t = (42 − 46)/(11.9/√12) = −1.16.
• Decision: Do not reject H0 and conclude that the
average number of kilowatt hours used annually by
home vacuum cleaners is not significantly less than 46.
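The vacuum-cleaner t statistic can be sketched in a couple of lines (variable names are mine):

```python
from math import sqrt

mu0, xbar, s, n = 46, 42, 11.9, 12

t = (xbar - mu0) / (s / sqrt(n))  # single-sample t statistic, df = n - 1 = 11

print(round(t, 2))  # -1.16
```

Since −1.16 does not fall below the cutoff −1.796, H0 is not rejected.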
Testing the Difference Between Means
(Small Independent Samples)
Two Sample t-Test
If samples of size less than 30 are taken from normally-distributed populations, a t-test
may be used to test the difference between the population means μ1 and μ2.

Three conditions are necessary to use a t-test for small independent samples.

1. The samples must be randomly selected.


2. The samples must be independent. Two samples are independent if the
sample selected from one population is not related to the sample selected
from the second population.
3. Each population must have a normal distribution.
Two Sample t-Test
Two-Sample t-Test for the Difference Between Means
A two-sample t-test is used to test the difference between two
population means μ1 and μ2 when a sample is randomly selected from
each population. Performing this test requires each population to be
normally distributed, and the samples should be independent. The
standardized test statistic is

t = ((x̄1 − x̄2) − (μ1 − μ2)) / σ_(x̄1−x̄2).

If the population variances are equal, then information from the two samples is combined to calculate a pooled estimate of the standard deviation σ̂:

σ̂ = √( ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2) )
Two Sample t-Test
Two-Sample t-Test (continued)
The standard error for the sampling distribution of x̄1 − x̄2 is

σ_(x̄1−x̄2) = σ̂ √(1/n1 + 1/n2)   (variances equal)

with d.f. = n1 + n2 − 2.

If the population variances are not equal, then the standard error is

σ_(x̄1−x̄2) = √(s1²/n1 + s2²/n2)   (variances not equal)

with d.f. = smaller of n1 − 1 and n2 − 1.
Two Sample t-Test for the Means
Using a Two-Sample t-Test for the Difference Between Means (Small Independent
Samples)

In Words                                       In Symbols
1. State the claim mathematically; identify    State H0 and H1.
   the null and alternative hypotheses.
2. Specify the level of significance.          Identify α.
3. Identify the degrees of freedom and         d.f. = n1 + n2 − 2, or
   sketch the sampling distribution.           d.f. = smaller of n1 − 1 and n2 − 1.
4. Determine the critical value(s).            Use the t table.
Two Sample t-Test for the Means
Using a Two-Sample t-Test for the Difference Between Means (Small Independent
Samples)

In Words                                       In Symbols
5. Determine the rejection region(s).
6. Find the standardized test statistic.       t = ((x̄1 − x̄2) − (μ1 − μ2)) / σ_(x̄1−x̄2)
7. Make a decision to reject or fail to        If t is in the rejection region, reject H0;
   reject the null hypothesis.                 otherwise, fail to reject H0.
8. Interpret the decision in the context of the original claim.
Two Sample t-Test for the Means
Example:
A random sample of 17 police officers in Brownsville has a mean
annual income of $35,800 and a standard deviation of $7,800. In
Greensville, a random sample of 18 police officers has a mean
annual income of $35,100 and a standard deviation of $7,375. Test
the claim at  = 0.01 that the mean annual incomes in the two
cities are not the same. Assume the population variances are
equal.
H0: 1 = 2
 = 0.005  = 0.005
Ha: 1  2 (Claim)

t
d.f. = n1 + n2 – 2 -3 -2 -1 0 1 2 3
t0 = 2.576
–t0 = –2.576
= 17 + 18 – 2 = 33 Continued.
Two Sample t-Test for the Means
Example continued:

H0: μ1 = μ2
Ha: μ1 ≠ μ2 (claim)
Critical values: ±2.576

The standard error is

σ_(x̄1−x̄2) = σ̂ √(1/n1 + 1/n2)
           = √( ((17 − 1)(7800)² + (18 − 1)(7375)²) / (17 + 18 − 2) ) × √(1/17 + 1/18)
           = 7584.0355 × 0.3382
           ≈ 2564.92
Two Sample t-Test for the Means
Example continued:

The standardized test statistic is

t = ((x̄1 − x̄2) − (μ1 − μ2)) / σ_(x̄1−x̄2) = ((35800 − 35100) − 0)/2564.92 ≈ 0.273

Since 0.273 does not fall in the rejection region, fail to reject H0. There is not enough evidence at the 1% level to support the claim that the mean annual incomes differ.
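The pooled-variance computation can be sketched from the summary statistics (variable names are mine):

```python
from math import sqrt

x1, s1, n1 = 35800, 7800, 17  # Brownsville
x2, s2, n2 = 35100, 7375, 18  # Greensville

# Pooled standard deviation (population variances assumed equal)
sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
se = sp * sqrt(1 / n1 + 1 / n2)   # standard error of x1 - x2
t = (x1 - x2) / se

print(round(se, 1))  # 2564.9
print(round(t, 3))   # 0.273
```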
Normal or t-Distribution?
• Are both sample sizes at least 30? If yes, use the z-test.
• If not, are both populations normally distributed? If no, you cannot use the z-test or the t-test.
• If yes, are both population standard deviations known? If yes, use the z-test.
• If not, are the population variances equal?
  – If yes, use the t-test with σ_(x̄1−x̄2) = σ̂ √(1/n1 + 1/n2) and d.f. = n1 + n2 − 2.
  – If no, use the t-test with σ_(x̄1−x̄2) = √(s1²/n1 + s2²/n2) and d.f. = smaller of n1 − 1 and n2 − 1.
Chi-Square Distribution
• To study the variability of a population, the sampling distribution of S² is used in learning about its parametric counterpart, the population variance σ².
• If a random sample of size n is drawn from a normal population with mean μ and variance σ², and the sample variance is computed, we obtain a value of the statistic S².
• (n − 1)S²/σ² is a chi-square random variable with n − 1 degrees of freedom.
Chi-Square distribution

The probability that a random sample produces a χ² value greater than some specified value is equal to the area under the curve to the right of this value.
Chi-Square Test
• A manufacturer of car batteries claims that the life of the
  company’s batteries is approximately normally distributed
  with a standard deviation equal to 0.9 year. If a random
  sample of 10 of these batteries has a standard deviation of 1.2
  years, do you think that σ > 0.9 year? Use a 0.05 level of
  significance.
• H0: σ² = 0.81.
• H1: σ² > 0.81.
• α = 0.05.
• Critical region: the null hypothesis is rejected when
  χ² = (n − 1)s²/σ0² > 16.919.
• Computations: s² = (1.2)² = 1.44, n = 10, and
  χ² = (9)(1.44)/0.81 = 16.0.
• Decision: Since 16.0 < 16.919, the χ²-statistic is not significant at the
  0.05 level; there is not sufficient evidence that σ > 0.9 year.
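A quick numeric check of the battery example (standard library only; variable names are illustrative):

```python
# Chi-square test for a population variance: H0: sigma^2 = 0.81, H1: sigma^2 > 0.81
n = 10
s = 1.2             # sample standard deviation
sigma0_sq = 0.9**2  # hypothesized population variance (0.81)

chi_sq = (n - 1) * s**2 / sigma0_sq
critical = 16.919   # chi-square table value, alpha = 0.05, d.f. = 9

print(round(chi_sq, 1), chi_sq > critical)  # → 16.0 False, so fail to reject H0
```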
Multinomial Experiments
A multinomial experiment is a probability experiment consisting of
a fixed number of independent trials in which there are more than two
possible outcomes for each trial. (Unlike the binomial experiment, in
which there are only two possible outcomes.)

Example:
A researcher claims that the distribution of favorite pizza toppings
among teenagers is as shown below. Each outcome is classified into
one of five categories, and the probability for each possible outcome
is fixed.

Topping      Frequency, f
Cheese       41%
Pepperoni    25%
Sausage      15%
Mushrooms    10%
Onions       9%
Chi-Square Goodness-of-Fit Test
A Chi-Square Goodness-of-Fit Test is used to test whether an observed
frequency distribution fits an expected distribution.
To calculate the test statistic for the chi-square goodness-of-fit test, the
observed frequencies and the expected frequencies are used.
The observed frequency O of a category is the frequency for the category
observed in the sample data.
The expected frequency E of a category is the calculated frequency for the
category. Expected frequencies are obtained assuming the specified (or
hypothesized) distribution. The expected frequency for the ith category is
Ei = npi
where n is the number of trials (the sample size) and pi is the assumed
probability of the ith category.
Observed and Expected Frequencies
Example:
200 teenagers are randomly selected and asked what their favorite
pizza topping is. The results are shown below. Find the observed
frequencies and the expected frequencies (n = 200).

Topping      % of teenagers   Observed Frequency   Expected Frequency
Cheese       41%              78                   200(0.41) = 82
Pepperoni    25%              52                   200(0.25) = 50
Sausage      15%              30                   200(0.15) = 30
Mushrooms    10%              25                   200(0.10) = 20
Onions       9%               15                   200(0.09) = 18
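The expected-frequency column follows directly from Ei = n·pi; a minimal Python sketch (the dictionary layout is illustrative):

```python
n = 200
claimed = {"Cheese": 0.41, "Pepperoni": 0.25, "Sausage": 0.15,
           "Mushrooms": 0.10, "Onions": 0.09}

# E_i = n * p_i for each category (rounded to clean up float noise)
expected = {topping: round(n * p, 4) for topping, p in claimed.items()}
print(expected)  # → {'Cheese': 82.0, 'Pepperoni': 50.0, 'Sausage': 30.0, 'Mushrooms': 20.0, 'Onions': 18.0}
```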
Chi-Square Goodness-of-Fit Test
For the chi-square goodness-of-fit test to be used, the following must be true.
1. The observed frequencies must be obtained by using a random sample.
2. Each expected frequency must be greater than or equal to 5.

The Chi-Square Goodness-of-Fit Test
If the conditions listed above are satisfied, then the sampling
distribution for the goodness-of-fit test is approximated by a chi-
square distribution with k – 1 degrees of freedom, where k is the
number of categories. The test statistic for the chi-square goodness-of-
fit test is
χ² = Σ (Oi – Ei)² / Ei   (sum over i = 1 to k)
where Oi represents the observed frequency of the ith category and Ei
represents the expected frequency of the ith category. The test is always
a right-tailed test.
Chi-Square Goodness-of-Fit Test
Performing a Chi-Square Goodness-of-Fit Test

In Words                                        In Symbols
1. Identify the claim. State the null           State H0 and Ha.
   and alternative hypotheses.
2. Specify the level of significance.           Identify α.
3. Identify the degrees of freedom.             d.f. = k – 1
4. Determine the critical value.                Use the chi-square table.
5. Determine the rejection region.
Continued.
Chi-Square Goodness-of-Fit Test
Performing a Chi-Square Goodness-of-Fit Test

In Words                                        In Symbols
6. Calculate the test statistic.                χ² = Σ (Oi – Ei)² / Ei
7. Make a decision to reject or fail            If χ² is in the rejection
   to reject the null hypothesis.               region, reject H0;
                                                otherwise, fail to reject H0.
8. Interpret the decision in the
   context of the original claim.
Chi-Square Goodness-of-Fit Test
Example:
A surveyor conducted a survey on the pizza-topping preferences of 200 randomly selected
teenagers. He found the statistics shown below. Does the observed distribution differ
from the expected distribution?

Topping      Observed Frequency, f
Cheese       39%
Pepperoni    26%
Sausage      15%
Mushrooms    12.5%
Onions       7.5%

Using α = 0.01, and the observed and expected values previously calculated, test the
surveyor’s claim using a chi-square goodness-of-fit test.
Continued.
Chi-Square Goodness-of-Fit Test
Example continued:

H0: The observed frequencies do not differ from the expected frequencies. (Claim)
Ha: The observed frequencies differ from the expected frequencies.

Because there are 5 categories, the chi-square distribution has k – 1 = 5 – 1 = 4 degrees of
freedom.

With d.f. = 4 and α = 0.01, the critical value is χ²0 = 13.277.

Continued.
Chi-Square Goodness-of-Fit Test
Example continued:

Topping      Observed Frequency   Expected Frequency
Cheese       78                   82
Pepperoni    52                   50
Sausage      30                   30
Mushrooms    25                   20
Onions       15                   18

Rejection region: α = 0.01, critical value χ²0 = 13.277

χ² = Σ (Oi – Ei)²/Ei
   = (78 – 82)²/82 + (52 – 50)²/50 + (30 – 30)²/30 + (25 – 20)²/20 + (15 – 18)²/18
   ≈ 2.025

Because 2.025 < 13.277, fail to reject H0.
There is not enough evidence at the 1% level to reject the surveyor’s claim.
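A quick Python check of this goodness-of-fit computation (standard library only; variable names are illustrative):

```python
observed = [78, 52, 30, 25, 15]   # Cheese, Pepperoni, Sausage, Mushrooms, Onions
expected = [82, 50, 30, 20, 18]   # E_i = 200 * p_i from the claimed distribution

# Goodness-of-fit statistic: sum of (O - E)^2 / E over the k categories
chi_sq = sum((o - e)**2 / e for o, e in zip(observed, expected))
df = len(observed) - 1            # k - 1
critical = 13.277                 # chi-square table value, alpha = 0.01, d.f. = 4

print(round(chi_sq, 3), df, chi_sq > critical)  # → 2.025 4 False, so fail to reject H0
```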
Another Example
• In a study of vehicle ownership, it has been found that 13.5% of
U.S. households do not own a vehicle, with 33.7% owning 1
vehicle, 33.5% owning 2 vehicles, and 19.3% owning 3 or more
vehicles. The data for a random sample of 100 households in a
resort community are summarized below. At the 0.05 level of
significance, can we conclude that the vehicle-ownership
distribution in this community differs from that of the
nation as a whole?
# Vehicles Owned # Households
0 20
1 35
2 23
3 or more 22
Goodness-of-Fit: An Example
I. Hypotheses:
H0: Vehicle-ownership distribution in this community is the same as it is
in the nation as a whole.
H1: Vehicle-ownership distribution in this community is not the
same as it is in the nation as a whole.

# Vehicles   Oj    Ej     [Oj – Ej]²/Ej
0            20    13.5   3.1296
1            35    33.7   0.0501
2            23    33.5   3.2910
3+           22    19.3   0.3777
                          Sum = 6.8484
Goodness-of-Fit: An Example
II. Rejection Region:
    α = 0.05
    df = k – 1 = 4 – 1 = 3
    Reject H0 if χ² > 7.815 (right-tail area 0.05); otherwise do not reject.
III. Test Statistic: χ² = 6.8484
IV. Conclusion: Since the test statistic of χ² = 6.8484 falls below the
critical value of χ² = 7.815, we do not reject H0 at the 0.05 level of
significance.
V. Implications: There is not enough evidence to show that vehicle
ownership in this community differs from that in the nation as a
whole.
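A quick Python check of this example (standard library only). Note that the exact sum is about 6.8485; the 6.8484 above comes from adding the already-rounded cell contributions:

```python
# National proportions and the community sample of n = 100 households
proportions = [0.135, 0.337, 0.335, 0.193]   # 0, 1, 2, 3+ vehicles
observed    = [20, 35, 23, 22]
n = sum(observed)                            # 100

expected = [n * p for p in proportions]      # 13.5, 33.7, 33.5, 19.3
chi_sq = sum((o - e)**2 / e for o, e in zip(observed, expected))

critical = 7.815                             # chi-square table value, alpha = 0.05, d.f. = 3
print(round(chi_sq, 4), chi_sq > critical)   # about 6.8485, False -> do not reject H0
```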
Independence Using the Chi-Square Test
Chi-Square Independence Test
A chi-square independence test is used to test the independence
of two variables. Using a chi-square test, you can determine
whether the occurrence of one variable affects the probability of
the occurrence of the other variable.
Contingency Tables
An r × c contingency table shows the observed frequencies for two
variables. The observed frequencies are arranged in r rows and c
columns. The intersection of a row and a column is called a cell.

The following contingency table shows a random sample of 321 fatally
injured passenger vehicle drivers by age and gender. (Adapted from
Insurance Institute for Highway Safety)

Age
Gender   16–20   21–30   31–40   41–50   51–60   61 and older
Male     32      51      52      43      28      10
Female   13      22      33      21      10      6
An Integrated Definition of Independence
• From basic probability:
  If two events are independent,
  P(A and B) = P(A) × P(B)
• In the Chi-Square Test of Independence:
  If two variables are independent,
  P(rowi and columnj) = P(rowi) × P(columnj)
Chi-Square Tests of Independence
• Calculating expected values:

  Eij = P(rowi and columnj) × n = P(rowi) × P(columnj) × n
      = (# elements in rowi / n) × (# elements in columnj / n) × n

  Cancelling two factors of n,

  Eij = (# elements in rowi) × (# elements in columnj) / n
Expected Frequency
Example:
Find the expected frequency for each cell in the contingency table for the
sample of 321 fatally injured drivers. Assume that the variables, age and
gender, are independent.

Age
Gender   16–20   21–30   31–40   41–50   51–60   61 and older   Total
Male     32      51      52      43      28      10             216
Female   13      22      33      21      10      6              105
Total    45      73      85      64      38      16             321

Continued.
Expected Frequency
Example continued:

Age
Gender   16–20   21–30   31–40   41–50   51–60   61 and older   Total
Male     32      51      52      43      28      10             216
Female   13      22      33      21      10      6              105
Total    45      73      85      64      38      16             321

Expected frequency: E(r,c) = (Sum of row r) × (Sum of column c) / Sample size

E1,1 = 216 × 45/321 = 30.28    E1,2 = 216 × 73/321 = 49.12    E1,3 = 216 × 85/321 = 57.20
E1,4 = 216 × 64/321 = 43.07    E1,5 = 216 × 38/321 = 25.57    E1,6 = 216 × 16/321 = 10.77
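The row of expected frequencies above can be generated from the marginal totals; a minimal Python sketch (the variable names are illustrative):

```python
# Row and column totals from the contingency table (n = 321)
row_totals = {"Male": 216, "Female": 105}
col_totals = [45, 73, 85, 64, 38, 16]   # age groups 16-20 through 61 and older
n = 321

# E_ij = (row total i) * (column total j) / n, rounded to two decimals
expected_male = [round(row_totals["Male"] * c / n, 2) for c in col_totals]
print(expected_male)  # → [30.28, 49.12, 57.2, 43.07, 25.57, 10.77]
```

The female row follows the same way with `row_totals["Female"]`.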
Chi-Square Independence Test
For the chi-square independence test to be used, the following must be true.
1. The observed frequencies must be obtained by using a random sample.
2. Each expected frequency must be greater than or equal to 5.
The Chi-Square Independence Test
If the conditions listed are satisfied, then the sampling distribution for
the chi-square independence test is approximated by a chi-square
distribution with
(r – 1)(c – 1)
degrees of freedom, where r and c are the number of rows and
columns, respectively, of a contingency table. The test statistic for the
chi-square independence test is
χ² = Σ (Oi – Ei)² / Ei   (sum over i = 1 to k)     The test is always a right-tailed test.
where Oi represents the observed frequency of ith category and Ei
represents the expected frequency of ith category.
Chi-Square Independence Test
Performing a Chi-Square Independence Test

In Words                                        In Symbols
1. Identify the claim. State the null           State H0 and Ha.
   and alternative hypotheses.
2. Specify the level of significance.           Identify α.
3. Identify the degrees of freedom.             d.f. = (r – 1)(c – 1)
4. Determine the critical value.                Use the chi-square table.
5. Determine the rejection region.
Continued.
Chi-Square Independence Test
Performing a Chi-Square Independence Test

In Words                                        In Symbols
6. Calculate the test statistic.                χ² = Σ (Oi – Ei)² / Ei
7. Make a decision to reject or fail            If χ² is in the rejection
   to reject the null hypothesis.               region, reject H0;
                                                otherwise, fail to reject H0.
8. Interpret the decision in the
   context of the original claim.
Chi-Square Independence Test
Example:
The following contingency table shows a random sample of 321 fatally injured passenger
vehicle drivers by age and gender. The expected frequencies are displayed in
parentheses. At α = 0.05, can you conclude that the drivers’ ages are related to gender
in such accidents?

Age
Gender   16–20     21–30     31–40     41–50     51–60     61 and older   Total
Male     32        51        52        43        28        10             216
         (30.28)   (49.12)   (57.20)   (43.07)   (25.57)   (10.77)
Female   13        22        33        21        10        6              105
         (14.72)   (23.88)   (27.80)   (20.93)   (12.43)   (5.23)
Total    45        73        85        64        38        16             321
Chi-Square Independence Test
Example continued:
Because each expected frequency is at least 5 and the drivers were randomly selected,
the chi-square independence test can be used to test whether the variables are
independent.

H0: The drivers’ ages are independent of gender.
Ha: The drivers’ ages are dependent on gender. (Claim)

d.f. = (r – 1)(c – 1) = (2 – 1)(6 – 1) = (1)(5) = 5

With d.f. = 5 and α = 0.05, the critical value is χ²0 = 11.071.

Continued.
Chi-Square Independence Test
Example continued:

Rejection region: α = 0.05, critical value χ²0 = 11.071

O     E       O – E    (O – E)²   (O – E)²/E
32    30.28    1.72    2.9584     0.0977
51    49.12    1.88    3.5344     0.0720
52    57.20   –5.20    27.04      0.4727
43    43.07   –0.07    0.0049     0.0001
28    25.57    2.43    5.9049     0.2309
10    10.77   –0.77    0.5929     0.0551
13    14.72   –1.72    2.9584     0.2010
22    23.88   –1.88    3.5344     0.1480
33    27.80    5.20    27.04      0.9727
21    20.93    0.07    0.0049     0.0002
10    12.43   –2.43    5.9049     0.4751
6     5.23     0.77    0.5929     0.1134

χ² = Σ (O – E)²/E ≈ 2.84

Because 2.84 < 11.071, fail to reject H0.
There is not enough evidence at the 5% level to conclude that age is dependent on
gender in such accidents.
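The full test statistic can be recomputed from the observed and expected counts in the table; a minimal Python sketch:

```python
# Observed counts (Male row, then Female row) and expected counts from the table
observed = [32, 51, 52, 43, 28, 10, 13, 22, 33, 21, 10, 6]
expected = [30.28, 49.12, 57.20, 43.07, 25.57, 10.77,
            14.72, 23.88, 27.80, 20.93, 12.43, 5.23]

# Independence test statistic: sum of (O - E)^2 / E over all cells
chi_sq = sum((o - e)**2 / e for o, e in zip(observed, expected))
df = (2 - 1) * (6 - 1)                 # (r - 1)(c - 1) = 5
critical = 11.071                      # chi-square table value, alpha = 0.05, d.f. = 5

print(round(chi_sq, 2), chi_sq > critical)  # about 2.84, False -> fail to reject H0
```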