Workbook.hypothesis testing

The document outlines various scenarios for hypothesis testing in inferential statistics, including setting up hypothesis statements for different situations such as pain reliever effectiveness, smoking cessation attempts, manufacturing quality control, and marketing studies. It also discusses significance levels, Type I and II errors, test statistics for one- and two-tailed tests, p-values, and confidence intervals for the difference of means. Each section provides examples and calculations relevant to the hypothesis testing process.

Uploaded by

kart238
Copyright
© All Rights Reserved

Hypothesis testing

INFERENTIAL STATISTICS AND HYPOTHESES

1. A current pain reliever has an 85 % success rate of treating pain. A
company develops a new pain reliever and wants to show that its success
rate of treating pain is better than the current option. Decide if the
hypothesis statement would require a population proportion or a
population mean, then set up the statistical hypothesis statements for the
situation.

2. A research study on people who quit smoking wants to show that the
average number of attempts to quit before a smoker is successful is less
than 3.5 attempts. How should they set up their hypothesis statements?

3. A factory creates a small metal cylindrical part that later becomes part
of a car engine. Because of variations in the process of manufacturing, the
diameters are not always identical. The machine was calibrated to create
cylinders with an average diameter of 1/16 of an inch. During a periodic
inspection, it became clear that further investigation was needed to
determine whether or not the machine responsible for making the part
needed recalibration. Write statistical hypothesis statements.

4. A marketing study for a clothing company concluded that the mean
percentage increase in sales could potentially be over 17 % for creating a
clothing line that focused on lime green and polka dots. Which hypothesis
statements do they need to write in order to test their theory?

5. A food company wants to ensure that less than 0.0001 % of its product
is contaminated. Which hypothesis statements will it write if it wants to
test for this?

6. A new medication is being developed to prevent heartworms in dogs,
and the developer wants it to work better than the current medication.
The current medication prevents heartworms at a rate of 75 %. What
hypothesis statements should they write if they want to test whether or
not the new medication works better than the existing one?

SIGNIFICANCE LEVEL AND TYPE I AND II ERRORS

1. We’re running a statistical test on a new pharmaceutical drug. The
stakes are high, because the side effects of the drug could potentially be
serious, or even fatal. If we want to keep the Type I and Type II error
rates as low as possible, to avoid rejecting the null when it’s true or
failing to reject the null when it’s false, what should we do when we take
the sample?

2. If the probability of making a Type II error in a statistical test is 5 %,
what is the power of the test?

3. On average, professional golfers make 75 % of putts within 5 feet. One
golfer believes he does better than this, and wants to use a statistical test
to see whether or not he’s correct. Unbeknownst to him, in actuality this
golfer makes 7 out of 10 of these kinds of putts. When he takes a sample
of his putts, he finds p̂ = 0.92. What kind of error might he be in danger of
making?

4. The average age of a guest at an amusement park is 15 years old. One
amusement park believes the average age of their guests is younger than
this, and wants to use a statistical test to see whether or not they’re
correct. Unbeknownst to them, in actuality the average guest age at this
particular amusement park is 12 years old. When they take a sample of their
guests, they find x̄ = 16 years. What kind of error might they be in danger
of making?

5. Of all political donations, 70 % come from corporations and lobbies,
not from individual citizens. One politician believes he receives less than
70 % of his own donations from corporations and lobbies, and wants to use
a statistical test to see whether or not he’s correct. Unbeknownst to him,
in actuality the proportion of his donations that come from corporations
and lobbies is 65 %. When he takes a sample of his donations, he finds that
the proportion coming from corporations and lobbies is p̂ = 0.72. What kind
of error might he be in danger of making?

6. A coffee shop owner believes that he sells 500 cups of coffee each
day, on average, and he wants to test this assumption. The truth is, he
actually sells fewer than 500 cups each day. He takes a random sample of
10 days and records the number of cups he sells each of those days. What
kind of error is the coffee shop owner in danger of making?

Day          1    2    3    4    5    6    7    8    9   10
Cups sold  488  502  496  506  492  489  510  511  506  500

TEST STATISTICS FOR ONE- AND TWO-TAILED TESTS

1. A local high school states that its students perform much better than
average on a state exam. The average score for all high school students in
the state is 106 points. A sample of 256 students at this particular school
had an average test score of 129 points with a sample standard deviation
of 26.8. Choose and calculate the appropriate test statistic.
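
As a hedged illustration of the calculation this kind of problem asks for, the sketch below computes a one-sample test statistic for a mean, (x̄ − μ0)/(s/√n), using the figures from the exam-score problem above. The function name `mean_test_statistic` is a hypothetical helper, not something defined in this workbook, and reading the result as a z-score rests on the assumption that n = 256 is large enough for the normal approximation.

```python
import math

def mean_test_statistic(sample_mean, null_mean, sample_sd, n):
    """Return (sample_mean - null_mean) divided by the standard error."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - null_mean) / standard_error

# Figures from the exam-score problem: x-bar = 129, mu0 = 106, s = 26.8, n = 256.
exam_statistic = mean_test_statistic(129, 106, 26.8, 256)
print(round(exam_statistic, 2))
```

The same helper could be reused for the other one-sample mean problems in this section by swapping in their figures; with a small sample the statistic would be compared against a t-distribution rather than the normal.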

2. A dietician is looking into the claim at a local restaurant that the
number of calories in its portion sizes is lower than the national average.
The national average is 1,500 calories per meal. She samples 35 meals at
the restaurant and finds they contain an average of 1,250 calories per meal
with a sample standard deviation of 350.2. Choose and calculate the
appropriate test statistic.

3. In a recent survey, 567 out of 768 randomly selected dog owners said
they used a kennel that was run by their veterinary office to board their
dogs while they were away on vacation. The study would like to make a
conclusion that the majority (more than 50 %) of dog owners use a kennel
run by their veterinary office when the owners go on vacation. Choose and
calculate the appropriate test statistic.

4. We want to open a day care center, so we take a random sample of
500 households in our town with children under preschool age, and find
that 243 of them were using a family member to care for those children.
We want to determine if, at a statistically significant level, fewer than half
of households in our town are using a family member to care for the kids.

1. Set up the hypothesis statements.

2. Check that the conditions for normality are met.

3. State the type of test: upper-tailed, lower-tailed, or two-tailed.

4. Calculate the test statistic using the appropriate formula.
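
The four steps above can be sketched for the day care problem as follows. This is a minimal illustration, not the workbook's own solution: `proportion_z_test` is a hypothetical helper, and the normality check uses the common rule of thumb that the expected numbers of successes and failures under the null are both at least 5 (some texts use 10), which is an assumption.

```python
import math

def proportion_z_test(successes, n, p0):
    """Return (conditions_met, z) for a one-sample proportion test."""
    # Assumed rule of thumb: expected successes and failures are both >= 5.
    conditions_met = n * p0 >= 5 and n * (1 - p0) >= 5
    p_hat = successes / n
    # The standard error uses the null proportion p0, not the sample proportion.
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return conditions_met, (p_hat - p0) / standard_error

# Day care problem: 243 of 500 households; H0: p >= 0.5 versus Ha: p < 0.5,
# which makes this a lower-tailed test.
conditions_ok, day_care_z = proportion_z_test(243, 500, 0.5)
print(conditions_ok, round(day_care_z, 2))
```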

5. The highest allowable amount of bromate in drinking water is
0.0100 mg/L. A survey of a city’s water quality took 50 water samples in
random locations around the city and found an average of 0.0102 mg/L of
bromate with a sample standard deviation of 0.0025 mg/L. The survey
committee is interested in testing if the amount of bromate found in the
water samples is higher than the allowable amount at a statistically
significant level.

1. Set up the hypothesis statements.

2. Check that the conditions for normality are met.

3. State the type of test: upper-tailed, lower-tailed, or two-tailed.

4. Calculate the test statistic using the appropriate formula.

6. A farmer reads a study that states: The average weight of a day-old
chick upon hatching is μ0 = 38.60 grams with a population standard
deviation of σ = 5.7 grams. The farmer wants to see if her day-old chicks
have the same average. She takes a simple random sample of 60 of her
day-old chicks and finds their average weight is x̄ = 39.1 grams.

1. Set up the hypothesis statements.

2. Check that the conditions for normality are met.

3. State the type of test: upper-tailed, lower-tailed, or two-tailed.

4. Calculate the test statistic using the appropriate formula.

THE P-VALUE AND REJECTING THE NULL

1. A medical trial is conducted to test whether or not a new medicine
reduces total cholesterol, when the national average is 230 mg/dL with a
standard deviation of 16 mg/dL. The trial takes a simple random sample of
223 adults who take the new medicine, and finds x̄ = 227 mg/dL. What can
the trial conclude at a significance level of α = 0.01?

2. The national average length of pregnancy is 283.6 days with a
population standard deviation of 10.5 days. A hospital wants to know if the
average length of a pregnancy at their hospital deviates from the national
average. They use a sample of 9,411 births at the hospital to calculate a
test statistic of z = −1.60. Set up the hypothesis statements and find the
p-value.
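
For problems like this one, where a z statistic is already given, the p-value is an area under the standard normal curve. The sketch below is one hedged way to compute it with Python's standard library; `z_p_value` and its `tail` argument are hypothetical names, and the two-tailed setup (H0: μ = 283.6 versus Ha: μ ≠ 283.6) is an assumption based on the word "deviates" in the problem.

```python
from statistics import NormalDist

def z_p_value(z, tail):
    """Return the p-value for a z statistic; tail is 'lower', 'upper', or 'two'."""
    lower_area = NormalDist().cdf(z)  # area to the left of z
    if tail == "lower":
        return lower_area
    if tail == "upper":
        return 1 - lower_area
    if tail == "two":
        # Double the smaller of the two tail areas.
        return 2 * min(lower_area, 1 - lower_area)
    raise ValueError("tail must be 'lower', 'upper', or 'two'")

# Pregnancy-length problem: z = -1.60, two-tailed.
pregnancy_p = z_p_value(-1.60, "two")
print(round(pregnancy_p, 4))
```

The resulting p-value would then be compared with the chosen significance level α: reject the null when p is smaller than α, otherwise fail to reject it.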

3. The highest allowable amount of bromate in drinking water is
0.0100 mg/L. A survey of a city’s water quality took 31 water samples in
random locations around the city and used the data to calculate a test
statistic of t = 2.04. The city wants to know if the amount of bromate in
their drinking water is too high. Set up the hypothesis statements and
determine the type of test, then find the p-value.

4. A paint company produces glow-in-the-dark paint with an advertised
glow time of 15 min. A painter is interested in finding out if the product
performs worse than advertised. She sets up her hypothesis statements as
H0 : μ ≥ 15 and Ha : μ < 15, then calculates a test statistic of z = −2.30. What
would be the conclusions of her hypothesis test at significance levels of
α = 0.05, α = 0.01, and α = 0.001?

5. An article reports that the average wasted time by an employee is 125
minutes every day. A manager takes a small random sample of 16
employees and monitors their wasted time, calculating that average
wasted time for her employees is 118 minutes with a standard deviation of
28.7 minutes. She wants to know if 118 minutes is below average at a
significance level of α = 0.05. She assumes the population is normally
distributed.

1. State the population parameter and whether a t-test or z-test
should be used.

2. Check that the conditions for performing the statistical test are
met.

3. Set up the hypothesis statements.

4. State the type of test: upper-tailed, lower-tailed, or two-tailed.

5. Calculate the test statistic using the appropriate formula.

6. Calculate the p-value.

7. Compare the p-value to the significance level and draw a
conclusion.

6. We want to test if college students take fewer than 5 years to
graduate, on average, so we take a simple random sample of 30 students
and record their years to graduate. For the sample, x̄ = 4.9 and s = 0.5.
What can we conclude at 90 % confidence?

HYPOTHESIS TESTING FOR THE POPULATION PROPORTION

1. A large electric company claims that at least 80 % of the company’s
1,000,000 customers are very satisfied. Using a simple random sample, 100
customers were surveyed and 73 % of the participants were very satisfied.
Based on these results, should we use a one- or two-tailed test, and
should we accept or reject the company’s hypothesis? Assume a
significance level of 0.05.

2. A university is conducting a statistical test to determine whether the
percentage of its students who live on its campus is above the national
average of 64 %. They’ve calculated the test statistic to be z = 1.40. Set up
hypothesis statements and find the p-value.

3. A report claims that 60 % of American families take fewer than 6
months to purchase a home, from the time they start looking to the time
they make their first offer. A realtor wants to know if her clients purchase
at the same rate, so she takes a simple random sample of 50 of her clients
and finds p̂ = 0.64 and σp̂ ≈ 0.0693 (the square root of 0.0048). What can
she conclude with 90 % confidence?

4. A gambler wins 48 % of the hands he plays, but he feels like he’s on a
losing streak recently, winning fewer hands than normal. He takes a
random sample of 40 of his recent hands, and finds the proportion of
winning hands in the sample to be p̂ = 0.45 with σp̂ ≈ 0.0790 (the square
root of 0.00624). What can he conclude with 90 % confidence?

5. A study claims that the proportion of new homeowners who purchase
an internet subscription plan is 0.92. We take a random sample of 140 new
homeowners to test this claim, and find p̂ = 0.9 with σp̂ ≈ 0.0229. What can
we conclude at a significance level of α = 0.05?

6. A recent study reported that 15.3 % of patients who are admitted
to the hospital with a heart attack die within 30 days of admission. The
same study reported that 16.7 % of the 3,153 patients who went to the
hospital with a heart attack died within 30 days of admission when the lead
cardiologist was away.

Is there enough evidence to conclude that the percentage of patients who
die when the lead cardiologist is away is any different than when they’re
present? Make conclusions at significance levels of α = 0.05 and α = 0.01.

1. State the population parameter and whether a t-test or z-test
should be used.

2. Check that the conditions for performing the statistical test are
met.

3. Set up the hypothesis statements.

4. State the type of test: upper-tailed, lower-tailed, or two-tailed.

5. Calculate the test statistic using the appropriate formula.

6. Calculate the p-value.

7. Compare the p-value to the significance level and draw a
conclusion.

CONFIDENCE INTERVAL FOR THE DIFFERENCE OF MEANS

1. A researcher wants to compare the effectiveness of new blood
pressure medication for males and females. He takes a simple random
sample of 25 males and 25 females and finds an average drop in blood
pressure of 4.5 with a standard deviation of 0.35 for males, and an average
drop in blood pressure of 4.85 with a standard deviation of 0.22 for females.
Can he use pooled standard deviation to find the confidence interval?

2. A grocery store wants to know whether families of 3 spend more on
groceries than families of 2. They randomly survey ten 3-person families
and find a mean weekly grocery spend of $258 with a standard deviation of
$22, then randomly survey ten 2-person families and find a mean weekly
grocery spend of $252 with a standard deviation of $26. Calculate the
number of degrees of freedom.
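
Which degrees-of-freedom formula applies depends on whether the two samples are pooled, which the workbook does not state for this problem. As a hedged sketch, the code below computes both the pooled value, n1 + n2 − 2, and the Welch-Satterthwaite approximation used when variances are not assumed equal. The function names are hypothetical, and including the Welch formula is an assumption about what the reader may need.

```python
def pooled_df(n1, n2):
    """Degrees of freedom when a pooled standard deviation is used."""
    return n1 + n2 - 2

def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximation for unequal variances."""
    v1 = s1 ** 2 / n1  # variance of the first sample mean
    v2 = s2 ** 2 / n2  # variance of the second sample mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Grocery problem: s1 = 22 and s2 = 26, with 10 families in each sample.
grocery_pooled_df = pooled_df(10, 10)
grocery_welch_df = welch_df(22, 10, 26, 10)
print(grocery_pooled_df, round(grocery_welch_df, 2))
```

When the Welch value is not a whole number, it is commonly rounded down before looking up a critical value in a t table.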

3. For the last question, calculate a 95 % confidence interval around the
difference in mean weekly grocery spending for 3- and 2-person families.

4. A researcher is interested in whether a new fitness program lowers
systolic blood pressure. He enrolls 50 participants into the study and
randomly splits them into two groups of 25 each. The first group kept their
same physical activity habits, while the second group followed the new
fitness program. After a month, the mean systolic blood pressure in the
group of exercisers was 123 with standard deviation of 4, and the mean
systolic pressure in the group of non-exercisers was 131 with a standard
deviation of 5.5. Calculate the margin of error at 99 % confidence.

5. Given population standard deviations σ1 = 2.25 and σ2 = 2.02, with
sample means x̄1 = 14.5 and x̄2 = 13.6 and sample sizes n1 = 250 and n2 = 250,
calculate a 90 % confidence interval around the difference of means.
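
When both population standard deviations are known, as in this problem, the interval is usually built as (x̄1 − x̄2) ± z*·√(σ1²/n1 + σ2²/n2). The sketch below illustrates that calculation under that assumption; `z_interval_for_mean_difference` is a hypothetical helper name, and the critical value comes from the standard library's `NormalDist.inv_cdf` rather than a printed table.

```python
import math
from statistics import NormalDist

def z_interval_for_mean_difference(mean1, mean2, sigma1, sigma2, n1, n2, confidence):
    """Return (lower, upper) for mean1 - mean2 with known population sigmas."""
    # Two-sided critical value: leave (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    margin = z_star * standard_error
    difference = mean1 - mean2
    return difference - margin, difference + margin

# Figures from the problem above, at 90 % confidence.
ci_lower, ci_upper = z_interval_for_mean_difference(
    14.5, 13.6, 2.25, 2.02, 250, 250, 0.90
)
print(round(ci_lower, 4), round(ci_upper, 4))
```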

6. Owners of a large shopping center want to determine whether or not
there’s a difference in the amount of time that men and women spend per
visit to the shopping center. Previous studies showed a standard deviation
of 0.4 hours for men and 0.2 hours for women. The owners sample 500 men
and 500 women and find that the mean time spent per visit was 1.6 hours
for men and 2.5 hours for women. Find a 98 % confidence interval around
the difference of means.

HYPOTHESIS TESTING FOR THE DIFFERENCE OF MEANS

1. An ice cream shop owner believes his average daily revenue is higher
in August than it is in September. He calculated average daily revenue of
$496 in August and $456 in September, with standard deviations of $14 and
$21.5, respectively. What can he conclude at a 0.05 significance level using
a p-value approach?

2. A fitness coach wants to determine whether his new weight loss
program is more effective than his old program. He randomly samples 50
of his clients following each program, and finds a mean weight loss of 5.5
pounds with a standard deviation of 1.05 pounds for those following the
old program, and a mean weight loss of 6.12 pounds with a standard
deviation of 0.95 pounds for those following the new program. Using a
critical value approach, what can the coach conclude at a 0.01 level of
significance?

3. Test the claim that, in 2006, the mean weight of men in the US was not
significantly different from the mean weight of women. Previous research
showed population standard deviations were 10.25 pounds for men and
8.58 pounds for women. A random sample of 1,500 men has a mean weight
of 193.5 pounds and a random sample of 1,500 women has a mean weight
of 185.3 pounds. Assuming the population variances are unequal, use a
p-value approach to formulate a decision at the 0.05 significance level.
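
A p-value approach with known population standard deviations can be sketched as below. This is an illustration under stated assumptions, not the workbook's solution: the hypotheses are taken to be H0: μ1 − μ2 = 0 versus Ha: μ1 − μ2 ≠ 0 (two-tailed), and `two_sample_z` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2):
    """z statistic for H0: mu1 - mu2 = 0 with known population sigmas."""
    standard_error = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

# Men: mean 193.5, sigma 10.25, n 1,500. Women: mean 185.3, sigma 8.58, n 1,500.
weight_z = two_sample_z(193.5, 185.3, 10.25, 8.58, 1500, 1500)

# Two-tailed p-value: area in both tails beyond |z|.
weight_p = 2 * (1 - NormalDist().cdf(abs(weight_z)))
reject_null = weight_p < 0.05
print(round(weight_z, 2), reject_null)
```

With a statistic this far from zero, the computed p-value underflows to essentially 0 in floating point, so the decision does not depend on the exact significance level chosen.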

4. A research team wants to determine whether men and women drink a
different amount of water each day. They randomly sample 25 men and 25
women and find that the men consumed 1.48 liters of water with a
standard deviation of 0.13 liters, and that the women consumed 1.62 liters
of water with a standard deviation of 0.20 liters. Using a critical value
approach, what can the research team conclude at a 0.10 level of
significance?

5. Given x̄1 = 23.55 and x̄2 = 20.12 with s1 = 2.3, s2 = 2.9, n1 = 10, and n2 = 15,
determine whether the two population means differ significantly. Using a
critical value approach, and assuming population standard deviations are
unequal, what can we conclude at a 0.01 level of significance?

6. John claims that the temperature in July is higher than the
temperature in August. He recorded the temperature daily at 12:00 p.m.
throughout July and August. He found a mean temperature of 28.4 °C with
a standard deviation of 2.1 °C in July, and a mean temperature of 27.3 °C
with a standard deviation of 1.7 °C in August. Using a critical value
approach and assuming the population variances are unequal, what can
John conclude at a 0.05 level of significance?

MATCHED-PAIR HYPOTHESIS TESTING

1. A golf club manufacturer claims that their new driver delivers 15 yards
of extra driving distance. They record the before and after driving
distances of 10 top professional players.

Player           1    2    3    4    5    6    7    8    9   10
Before x1      303  308  295  305  301  312  287  294  300  301
After x2       307  320  297  315  305  316  299  302  307  315
Difference, d    4   12    2   10    4    4   12    8    7   14
d²              16  144    4  100   16   16  144   64   49  196

Can the manufacturer conclude at a 5 % significance level that their driver
delivers 15 yards of extra driving distance?
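
A matched-pair test works on the differences d = after − before and uses the statistic t = (d̄ − μd)/(s_d/√n). The sketch below computes that statistic for the golf data in the table above. It is a minimal illustration: `paired_t_statistic` is a hypothetical helper name, and taking the claimed mean difference to be 15 yards under the null is an assumption drawn from the problem statement.

```python
import math
import statistics

# Driving distances from the table above (yards).
before = [303, 308, 295, 305, 301, 312, 287, 294, 300, 301]
after = [307, 320, 297, 315, 305, 316, 299, 302, 307, 315]
differences = [a - b for a, b in zip(after, before)]

def paired_t_statistic(differences, claimed_mean_difference):
    """t statistic for the mean of paired differences against a claimed value."""
    n = len(differences)
    d_bar = statistics.mean(differences)
    s_d = statistics.stdev(differences)  # sample standard deviation (n - 1)
    return (d_bar - claimed_mean_difference) / (s_d / math.sqrt(n))

driver_t = paired_t_statistic(differences, 15)
print(round(driver_t, 2))
```

The statistic would then be compared with a critical value from a t table with n − 1 = 9 degrees of freedom. The same helper applies to the other tables in this section once the claimed difference and the direction of the test are set for each problem.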

2. A car company believes that the changes they’ve made to their hybrid
engine will increase miles per gallon by 4. They send out one car with the
old engine and one car with the new engine to drive the same route, and
record the miles per gallon of each pair of cars.

Route            1    2    3    4    5    6    7    8    9   10
Old engine      39   39   38   42   44   43   42   47   47   47
New engine      50   49   45   46   46   41   42   43   43   49
Difference, d   11   10    7    4    2   -2    0   -4   -4    2
d²             121  100   49   16    4    4    0   16   16    4

Can the car company conclude at a 1 % significance level that the changes
they’ve made to the hybrid engine deliver 4 extra miles per gallon?

3. We want to test the claim that listening to classical music while
studying makes students complete their homework faster. We ask 10
students to study in silence for the first semester, and study with classical
music for the second semester, then we record the mean number of hours
spent on homework per week in each semester.

Student          1    2    3    4    5    6    7    8    9   10
In silence      14   13   16   21   15   19   11   20   19   16
With music      12   13   15   22   16   19    8   17   18   17
Difference, d   -2    0   -1    1    1    0   -3   -3   -1    1
d²               4    0    1    1    1    0    9    9    1    1

Can we conclude at a 10 % significance level that studying with classical
music reduces the number of hours spent per week on homework?

4. A clothing store wants to test the claim that customers who join their
VIP program return less merchandise. They track the mean monthly
merchandise returns of 10 customers for one year before and after joining
the VIP program, then record the mean returns per month.

Customer         1    2    3    4    5    6    7    8    9   10
Before VIP      12   55   48   23   97  103   33   44   17   29
After VIP       15   44   35   20  100   97   30   41   24   40
Difference, d    3  -11  -13   -3    3   -6   -3   -3    7   11
d²               9  121  169    9    9   36    9    9   49  121

Can they conclude at a 5 % significance level that joining the VIP program
reduces the amount of merchandise returns?

5. If the mean difference is d̄ = 10 on a sample of n = 25 with sample
standard deviation s_d = 2.5, calculate the 95 % confidence interval around d̄.

6. If the mean difference is d̄ = 24 on a sample of n = 49 with population
standard deviation σ_d = 3.2, calculate the 99 % confidence interval around d̄.

CONFIDENCE INTERVAL FOR THE DIFFERENCE OF PROPORTIONS

1. Given x1 = 54 successes in the first sample n1 = 150, and x2 = 47
successes in the second sample n2 = 160, calculate a 95 % confidence
interval.
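
A confidence interval for a difference of proportions is commonly built as (p̂1 − p̂2) ± z*·√(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2), using the unpooled standard error. The sketch below applies that form to the figures in the problem above. `proportion_difference_interval` is a hypothetical helper name, and the choice of the unpooled standard error is an assumption, since the workbook does not spell out the formula.

```python
import math
from statistics import NormalDist

def proportion_difference_interval(x1, n1, x2, n2, confidence):
    """Return (lower, upper) for p1 - p2 using the unpooled standard error."""
    p1 = x1 / n1
    p2 = x2 / n2
    standard_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Two-sided critical value from the standard normal distribution.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * standard_error
    return (p1 - p2) - margin, (p1 - p2) + margin

# x1 = 54 of n1 = 150 and x2 = 47 of n2 = 160, at 95 % confidence.
prop_lower, prop_upper = proportion_difference_interval(54, 150, 47, 160, 0.95)
print(round(prop_lower, 4), round(prop_upper, 4))
```

An interval that contains 0 would indicate that the samples do not show a significant difference between the two proportions at that confidence level.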

2. A light bulb manufacturer wants to know whether their own bulbs last
longer than a competitor’s bulb. They randomly sampled 150 people who
bought their bulb, and 72 of them reported that it lasted longer than 250
days. They randomly sampled 150 people who bought the competitor’s
bulb, and 69 of them reported that it lasted for more than 250 days. Find a
90 % confidence interval around the difference of proportions.

3. A research team wants to know whether Vitamin C shortens recovery
time from the common cold. They chose 100 patients with the common
cold and randomly assigned 50 of them to the Vitamin C treatment group
and 50 of them to the placebo group. In the Vitamin C group, 38 patients
recovered in less than 7 days, while 24 patients in the placebo group
recovered in less than 7 days. Find a 99 % confidence interval around the
difference in population proportions.

4. A researcher randomly chose 900 smokers, 450 men and 450 women.
He found that 357 of the male smokers have been diagnosed with coronary
artery disease, while 295 of the female smokers have been diagnosed with
coronary artery disease. Construct a 95 % confidence interval to estimate
the difference between the proportions of male and female smokers who
have been diagnosed with coronary artery disease.

5. In a simple random sample of 1,000 people aged 20 to 24, 7 % said they
ran at least one marathon in the last year. In a simple random sample of
1,200 people aged 25 to 29, 12 % said they ran at least one marathon in the
last year. Find a 99 % confidence interval around the difference of
population proportions.

6. In a simple random sample of 280 Masters students from one
university, 24 said they planned to pursue a PhD. In a simple random
sample of 350 Masters students at a second university, 34 said they
planned to pursue a PhD. Build a 98 % confidence interval around the
difference of proportions.

HYPOTHESIS TESTING FOR THE DIFFERENCE OF PROPORTIONS

1. We defined the hypothesis statements below, and then found sample
proportions of p̂1 = 0.456 for n1 = 278 and p̂2 = 0.384 for n2 = 310. Using a
critical value approach, can we reject the null hypothesis at a confidence
level of 95 %?

H0 : p1 − p2 ≤ 0

Ha : p1 − p2 > 0

2. Given the hypothesis statements below, x1 = 234 with n1 = 1,150 and
x2 = 327 with n2 = 1,320, calculate the test statistic.

H0 : p1 − p2 = 0

Ha : p1 − p2 ≠ 0
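
Because the null hypothesis says the two proportions are equal, the test statistic is usually computed with a pooled proportion p̂ = (x1 + x2)/(n1 + n2) in the standard error. The sketch below shows that calculation for the figures in this problem. `pooled_two_proportion_z` is a hypothetical helper name, and using the pooled form is an assumption, since the workbook does not state which formula it expects.

```python
import math

def pooled_two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 - p2 = 0 using the pooled sample proportion."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error

# x1 = 234 of n1 = 1,150 and x2 = 327 of n2 = 1,320.
test_z = pooled_two_proportion_z(234, 1150, 327, 1320)
print(round(test_z, 2))
```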

3. A cinema owner wants to know whether there’s a difference in the
number of boys and girls who watched a new movie last week. She
randomly sampled 76 boys and 75 girls and found that 45 boys and 58 girls
watched the movie. What can she conclude about the difference of
proportions at a 99 % confidence level?

4. A store owner believes that women spend at least 22 % more in his
store than men. He randomly chooses 64 visitors, 32 men and 32 women,
and finds that 14 men spent more than $100, while 23 women spent more
than $100. Using a p-value approach, what can he conclude at a 90 %
confidence level?

5. In a random sample of 60 people under the age of 30, 14 % said they’re
planning to go hiking next month. In a random sample of 75 people older
than 50, 23 % said they’re planning to go hiking next month. Using a critical
value approach at a 95 % confidence level, is there enough evidence to
conclude that a higher proportion of people over age 50 plan to go hiking
next month than the proportion of people under 30 who plan to go hiking?

6. John and Steven are two fitness trainers who want to compare their
client satisfaction rate. John chose a random sample of 85 clients and
Steven chose a random sample of 72 clients. John found that 89 % of his
clients were satisfied and Steven found that 91 % of his clients were
satisfied. Using a critical value approach at a 95 % confidence level, is there
a significant difference between proportions?
