Hypothesis Testing
Cohen’s d = mean difference / standard deviation = (M - μ) / σ
Power of a Hypothesis Test
The power of a hypothesis test is defined as
the probability that the test will reject the
null hypothesis when the treatment does
have an effect. The power of a test
depends on a variety of factors including:
Sample size: power depends on both the size of
the treatment effect and the size of the sample.
A larger sample produces greater power for the
hypothesis test.
Alpha level: reducing the alpha level of the test
also reduces the power of the test. Lowering α
from .05 to .01 lowers the power of the
hypothesis test.
With α = .05 (two-tailed), the critical region on
the right-hand side begins at z = 1.96. If α were
changed to .01, the boundary would be moved
farther to the right, out to z = 2.58.
It should be clear that moving the critical
boundary to the right means that a smaller
portion of the treatment distribution (the
distribution on the right-hand side) will be in
the critical region. Thus, there would be a lower
probability of rejecting the null hypothesis and
a lower value for the power of the test.
One-tailed vs. two-tailed test: If the
treatment effect is in the predicted direction,
then changing from a regular two-tailed test to
a one-tailed test increases the power of the
hypothesis test.
With a two-tailed test at α = .05, the critical
region on the right-hand side begins at z =
1.96. Changing to a one-tailed test would move
the critical boundary to the left, to a value of z
= 1.65.
Moving the boundary to the left would cause a
larger proportion of the treatment distribution
to be in the critical region, which increases the
power of the test.
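The three factors above can be checked numerically. The following is a minimal sketch, not a definitive implementation: the function name z_test_power and its parameters are hypothetical, and the numbers plugged in are the ones used in Example 1 (μ = 80, σ = 10, n = 25, an 8-point treatment effect). It assumes a normal population with known σ and, for the one-tailed case, a critical region in the upper tail.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(mu0, mu1, sigma, n, alpha=0.05, two_tailed=True):
    """Approximate power of a z-test when the true (treated) mean is mu1.

    Hypothetical helper for illustration only.
    """
    std_normal = NormalDist()
    sigma_m = sigma / sqrt(n)        # standard error of the mean
    shift = (mu1 - mu0) / sigma_m    # treatment effect in standard-error units
    if two_tailed:
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        # Probability that the sample mean lands in either tail of the critical region
        upper = 1 - std_normal.cdf(z_crit - shift)
        lower = std_normal.cdf(-z_crit - shift)
        return upper + lower
    # One-tailed test with the critical region in the upper tail (assumption)
    z_crit = std_normal.inv_cdf(1 - alpha)
    return 1 - std_normal.cdf(z_crit - shift)


# Compare the factors one at a time, starting from n = 25, alpha = .05, two-tailed
print("baseline        :", round(z_test_power(80, 88, 10, 25), 4))
print("smaller sample  :", round(z_test_power(80, 88, 10, 4), 4))
print("alpha = .01     :", round(z_test_power(80, 88, 10, 25, alpha=0.01), 4))
print("one-tailed test :", round(z_test_power(80, 88, 10, 25, two_tailed=False), 4))
```

Under these assumptions the smaller sample and the stricter alpha level should both give lower power than the baseline, and the one-tailed test should give higher power.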
HYPOTHESIS TESTING THROUGH Z-SCORE
Alpha Levels for Hypothesis Test through Z-Test
If we revisit the z-scores for the 5% and 1%
levels, we can identify the boundaries of the
critical (rejection) regions from the unit normal
table.
A two-tailed test at the 5% level has critical
boundaries at z = +1.96 and z = -1.96.
A one-tailed test at the 5% level has a critical
boundary at z = +1.645 or z = -1.645 (commonly
rounded to 1.64 or 1.65).
A two-tailed test at the 1% level has critical
boundaries at z = +2.58 and z = -2.58.
A one-tailed test at the 1% level has a critical
boundary at z = +2.33 or z = -2.33.
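The critical boundaries listed above can also be computed rather than read from a table. This is a small sketch using Python's standard-library NormalDist; the helper name critical_z is hypothetical and is introduced only for illustration.

```python
from statistics import NormalDist


def critical_z(alpha, two_tailed=True):
    """Positive critical z boundary for a given alpha level (hypothetical helper)."""
    # A two-tailed test splits alpha equally between the two tails
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


for alpha in (0.05, 0.01):
    for two_tailed in (True, False):
        label = "two-tailed" if two_tailed else "one-tailed"
        print(f"alpha = {alpha:.2f}, {label}: z = {critical_z(alpha, two_tailed):.3f}")
```

For a two-tailed test the boundary applies on both sides (plus and minus); for a one-tailed test the sign depends on the predicted direction.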
Example: 1
We start with a normal shaped population
with a mean of μ = 80 and a standard
deviation of σ = 10. A researcher plans to
select a sample of n = 25 individuals from
this population and administer a treatment
to each individual. It is expected that the
treatment will have an 8-point effect; that
is, the treatment will add 8 points to each
individual’s score.
Step 1: State the Hypotheses
H0 : μ = 80
H1: μ > 80
Step 2: Select Alpha Level
Step 3: Calculate the test statistic
σM = σ / √n = 10 / √25 = 2
z = (M - μ) / σM = (88 - 80) / 2 = 4.0
Cohen’s d = (M - μ) / σ = 8 / 10 = 0.8
Step 4: Make a Decision
Sample means will be in the critical region and
we will reject the null hypothesis. In practical
terms, this means that the research study is
almost guaranteed to be successful. If the
researcher selects a sample of n = 25
individuals, and if the treatment really does have
Example: 2
A study examines self-esteem and
depression in teenagers. A sample of 25
teens with a low self-esteem are given the
Beck Depression Inventory. The average
score for the group is 20.9. For the general
population, the average score is 18.3 with σ
= 12. Use a two-tail test with α = 0.05 to
examine whether teenagers with low self-
esteem show significant differences in
depression.
Answer: z = 1.08, which is inside ±1.96, so we
fail to reject the null hypothesis. Effect size:
Cohen’s d = 0.22.
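The calculation behind this answer can be sketched in a few lines. This is an illustrative sketch only: the function name z_test and its return layout are hypothetical, and it assumes a two-tailed test with a known population standard deviation, as stated in Example 2.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu, sigma, n, alpha=0.05):
    """Two-tailed z-test with Cohen's d (hypothetical helper for illustration)."""
    sigma_m = sigma / sqrt(n)                      # standard error of the mean
    z = (sample_mean - mu) / sigma_m               # test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical boundary
    cohens_d = (sample_mean - mu) / sigma          # effect size
    reject = abs(z) > z_crit                       # True if z is in the critical region
    return z, cohens_d, reject


# Example 2: M = 20.9, mu = 18.3, sigma = 12, n = 25
z, d, reject = z_test(20.9, 18.3, 12, 25)
print(f"z = {z:.2f}, Cohen's d = {d:.2f}, reject H0: {reject}")
```

The same function can be reused for the other two-tailed examples by changing the four input values.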
Example: 3
You get hired as a server at a local restaurant,
and the manager tells you that servers’ tips are
$42 on average but vary about $12 (μ = 42, σ
= 12). You decide to track your tips to see if
you make a different amount, but because this
is your first job as a server, you don’t know if
you will make more or less in tips. After working
16 shifts, you find that your average nightly
amount is $44.50 from tips. Test for a difference
between this value and the population mean at
the α = 0.05 level of significance while
computing effect size as well
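Answer: z = (44.50 - 42) / (12 / √16) = 0.83,
which is inside ±1.96, so we fail to reject the
null hypothesis. Effect size: Cohen’s d =
2.50 / 12 = 0.21.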
Example: 4
A researcher begins with a known
population—in this case, scores on a
standardized test that are normally
distributed with μ = 65 and σ = 15. The
researcher suspects that special training in
reading skills will produce a change in the
scores for the individuals in the population.
Because it is not feasible to administer the
treatment (the special training) to everyone
in the population, a sample of n = 25
individuals is selected, and the treatment is
given to this sample. Following treatment,
the average score for this sample is M =
70. Is there evidence that the training has
Example: 5
The psychology department is gradually
changing its curriculum by increasing the
number of online course offerings. To evaluate
the effectiveness of this change, a random
sample of n = 36 students who registered for
Introductory Psychology is placed in the online
version of the course. At the end of the
semester, all students take the same final exam.
The average score for the sample is M = 76. For
the general population of students taking the
traditional lecture class, the final exam scores
form a normal distribution with a mean of μ =
71. If the population standard deviation is σ =
Example: 6
A random sample of n = 25 scores is selected
from a normal population with a mean of μ = 40.
After a treatment is administered to the
individuals in the sample, the sample mean is
found to be M = 44.
a. If the population standard deviation is σ = 5, is
the sample mean sufficient to conclude that the
treatment has a significant effect? Use a two-
tailed test with α = .05 and compute effect size.
(Reject the null hypothesis) (Cohen’s d = 0.80)
b. If the population standard deviation is σ = 15,
is the sample mean sufficient to conclude that the
treatment has a significant effect? Use a two-