Hypothesis testing concerning two samples

Onneth O. Tejada
STAT 3247 – Advanced Statistics

CENTRAL LUZON STATE UNIVERSITY


DEPARTMENT of
STATISTICS
Learning Outcomes
After completing this chapter, students must be able to:
• Review the t-test for two independent samples
• Apply the Wilcoxon rank-sum test
• Review the t-test for dependent samples
• Apply the Wilcoxon signed-rank test


Hypothesis Testing for Two Populations


• It is used when researchers wish to compare two sample means, using
experimental and control groups.
• Main interest is on the difference between the two populations.
• For example, the average lifetimes of two different brands of bus tires
might be compared to see whether there is any difference in tread wear.
Two different brands of fertilizer might be tested to see whether one is
better than the other for growing plants. Or two brands of cough syrup
might be tested to see whether one brand is more effective than the
other.


• Two samples are independent if the sample values selected from one
population are not related to or somehow paired or matched with the
sample values selected from the other population. If there is some
relationship so that each value in one sample is paired with a
corresponding value in the other sample, the samples are dependent.
Example:
▪ One group of subjects is treated with the cholesterol-reducing drug Lipitor, while a
second and separate group of subjects is given a placebo. These two sample groups
are independent because the individuals in the treatment group are in no way
paired or matched with corresponding members in the placebo group.
▪ The effectiveness of a diet is tested using weights of subjects measured before and
after the diet treatment. Each “before” value is matched with the “after” value
because each before/after pair of measurements comes from the same person.

Review t-test for two independent samples


• The t-test for two independent samples is used to determine if there is a
statistically significant difference between the means of two independent
groups.
• Main interest is on the difference between 𝜇1 and 𝜇2
Assumptions:
1. Both samples are random samples.
2. The samples must be independent of each other. That is, there can be
no relationship between the subjects in each sample.
3. When the sample sizes are less than 30, the populations must be
normally or approximately normally distributed.


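The null hypothesis is Ho: 𝜇1 = 𝜇2, or equivalently Ho: 𝜇1 − 𝜇2 = 0.

Test Statistic:

$$t_c = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \qquad S_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}$$

Ha           Decision Rule
𝜇1 ≠ 𝜇2     Reject Ho if $|t_c| \ge t_{\alpha/2,\, n_1+n_2-2}$
𝜇1 > 𝜇2     Reject Ho if $t_c > t_{\alpha,\, n_1+n_2-2}$
𝜇1 < 𝜇2     Reject Ho if $t_c < -t_{\alpha,\, n_1+n_2-2}$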


Example 1: The mean for the number of weeks 15 New York Times hardcover
fiction books spent on the bestseller list is 22 weeks. The standard
deviation is 6.17 weeks. The mean for the number of weeks 15 New York
Times hard-cover nonfiction books spent on the list is 28 weeks. The
standard deviation is 13.2 weeks. At 𝛼 = 0.10, can we conclude that there is
a difference in the mean times for the number of weeks the books were on
the bestseller lists?

Solution:
Step 1: State the null (Ho) and alternative (Ha) hypotheses. Identify the claim.
Ho : 𝜇1 = 𝜇2 and Ha : 𝜇1 ≠ 𝜇2 (claim)


Step 2: Determine the test statistic, critical value, and tail of the distribution
where the rejection region is located.
The test to use is the t-test, since the population standard deviations 𝜎 are unknown.
Since the alternative hypothesis is 𝜇1 ≠ 𝜇2, the test is two-tailed.
Since 𝛼 = 0.10 and the test is two-tailed, the critical value is
$t_{\alpha/2,\, n_1+n_2-2} = t_{0.05,\,28} = 1.701$.
[Figure: t distribution with rejection regions below −1.701 and above 1.701; the acceptance region lies between them.]
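The table lookup above can be reproduced in R. This is a minimal sketch using the base function qt(); the variable names (alpha, df, t_crit) are chosen here for illustration and do not come from the slides.

```r
# Two-tailed critical value for alpha = 0.10 with n1 = n2 = 15
alpha <- 0.10
n1 <- 15
n2 <- 15
df <- n1 + n2 - 2                 # degrees of freedom of the pooled t-test
t_crit <- qt(1 - alpha / 2, df)   # upper alpha/2 quantile of the t distribution
round(t_crit, 3)
```

For a one-tailed test, the same call with 1 - alpha instead of 1 - alpha / 2 gives the corresponding critical value.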


Step 3: Formulate the decision rule.

Since the alternative hypothesis is 𝜇1 ≠ 𝜇2, the decision rule is to
reject Ho if $|t_c| \ge t_{\alpha/2,\, n_1+n_2-2} = t_{0.05,\,28} = 1.701$.

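Step 4: Compute the value of the test statistic.

Given: fiction: $n_1 = 15$, $\bar{X}_1 = 22$, $s_1 = 6.17$
nonfiction: $n_2 = 15$, $\bar{X}_2 = 28$, $s_2 = 13.2$

$$S_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(15 - 1)(6.17)^2 + (15 - 1)(13.2)^2}{(15 - 1) + (15 - 1)} = 106.15$$

$$t_c = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(22 - 28) - 0}{\sqrt{106.15\left(\dfrac{1}{15} + \dfrac{1}{15}\right)}} = -1.595$$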
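The hand computation can be checked in R directly from the summary statistics. This is a sketch only; the variable names (xbar1, sp2, t_c, p_value) are illustrative, and the two-tailed P-value line is an addition that the slides do not compute for this example.

```r
# Summary statistics from Example 1
n1 <- 15; xbar1 <- 22; s1 <- 6.17   # fiction
n2 <- 15; xbar2 <- 28; s2 <- 13.2   # nonfiction

# Pooled variance and test statistic under Ho: mu1 - mu2 = 0
sp2 <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / ((n1 - 1) + (n2 - 1))
t_c <- ((xbar1 - xbar2) - 0) / sqrt(sp2 * (1 / n1 + 1 / n2))

# Two-tailed P-value with n1 + n2 - 2 degrees of freedom
p_value <- 2 * pt(-abs(t_c), df = n1 + n2 - 2)

round(c(sp2 = sp2, t_c = t_c), 3)
p_value
```

Because |t_c| is below the critical value 1.701, the P-value comes out above 0.10, which agrees with the decision reached by the critical-value approach.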


Step 5: Make a decision.

The decision is to fail to reject the null hypothesis, since the absolute
value of the test value, |−1.595| = 1.595, is less than the critical value
1.701.
[Figure: the test value −1.595 falls in the acceptance region between −1.701 and 1.701.]

Step 6: Draw a conclusion.

There is not enough evidence to support the claim that there is a difference in
the mean times for the number of weeks the books were on the bestseller lists.


Example 2: t-test for two independent samples in R

A teacher wants to know which of the two sections has a higher score. The
teacher will use these scores to make recommendations to the principal.
Random samples of students are asked about their scores. Test whether the
population mean scores are different for the two sections. Use 𝛼 = 0.05.

Section A 81 77 75 74 86 90 62 73 91 98
Section B 89 64 35 68 69 55 37 57 42 49


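# Scores of the sampled students in each section
section_A <- c(81, 77, 75, 74, 86, 90, 62, 73, 91, 98)
section_B <- c(89, 64, 35, 68, 69, 55, 37, 57, 42, 49)
# Two-sided test of equal means. By default t.test() runs the Welch
# (unequal-variance) version; add var.equal = TRUE for the pooled-variance test.
t.test(section_A, section_B, alternative = "two.sided", paired = FALSE,
       mu = 0, conf.level = 0.95)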

Ho: The population mean scores are not different for the two sections.
Ha: The population mean scores are different for the two sections.
Since the P-value (0.001) is less than the 0.05 level of significance, reject Ho. Therefore, we can conclude that the
population mean scores are different for the two sections.
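The pooled-variance formula reviewed earlier corresponds to var.equal = TRUE in t.test(), while the default call runs the Welch test. The sketch below compares the two on the same data; the object names welch and pooled are chosen here for illustration.

```r
section_A <- c(81, 77, 75, 74, 86, 90, 62, 73, 91, 98)
section_B <- c(89, 64, 35, 68, 69, 55, 37, 57, 42, 49)

welch  <- t.test(section_A, section_B)                    # default: var.equal = FALSE
pooled <- t.test(section_A, section_B, var.equal = TRUE)  # pooled-variance t-test

# Degrees of freedom differ; the pooled test uses n1 + n2 - 2
c(welch_df = unname(welch$parameter), pooled_df = unname(pooled$parameter))

# P-values of the two versions
c(welch_p = welch$p.value, pooled_p = pooled$p.value)
```

With equal sample sizes the two versions give the same t statistic and differ only in the degrees of freedom, so both lead to rejecting Ho here.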


Wilcoxon rank-sum test


• The Wilcoxon rank-sum test (also known as the Mann-Whitney U test) is
a non-parametric test used to determine whether there is a significant
difference between the distributions of two independent groups.
• It is often used as an alternative to the two-sample t-test when the
assumptions of normality or equal variances are not met.
When to Use the Wilcoxon Rank-Sum Test
• The data are ordinal or continuous but not normally distributed.
• You have two independent groups.
• You want to test whether the distributions of the two groups differ.


Example 1: Two independent random samples of 2nd year students from
Section A and Section B are selected, and the time in minutes it takes each
student to complete a laboratory practicum is recorded, as shown in the
table. At 𝛼 = 0.05, is there a difference in the times it takes the students to
complete the practicum?

Section A 15 18 16 17 13 22 24 17 19 21 26 28
Section B 14 9 16 19 10 12 11 8 15 18 25


Solution:

Step 1: State the hypotheses and identify the claim.


Ho: There is no difference in the times it takes the students to complete the
practicum.
Ha: There is a difference in the times it takes the students to complete the
practicum. (claim)

Step 2: Find the critical value.


Critical values can be found in Standard Normal Table (because the test
statistic is based on the normal distribution).
Since 𝛼 = 0.05 and this test is a two-tailed test, the critical value is ±1.96.


Step 3: Compute the test value.


a.) Combine the data from the two samples, arrange the combined data in
ascending order, and rank each value. Be sure to indicate the group.
Time 8 9 10 11 12 13 14 15 15 16 16 17
Group B B B B B A B A B A B A
Rank 1 2 3 4 5 6 7 8.5 8.5 10.5 10.5 12.5

Time 17 18 18 19 19 21 22 24 25 26 28
Group A B A A B A A A B A A
Rank 12.5 14.5 14.5 16.5 16.5 18 19 20 21 22 23
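The combined ranking can be reproduced with the base function rank(), which assigns tied values the average of their ranks by default. This is a sketch; the names time, group, ranks, and R_B are illustrative.

```r
section_A <- c(15, 18, 16, 17, 13, 22, 24, 17, 19, 21, 26, 28)
section_B <- c(14, 9, 16, 19, 10, 12, 11, 8, 15, 18, 25)

# Combine both samples and keep track of the group of each value
time  <- c(section_A, section_B)
group <- rep(c("A", "B"), times = c(length(section_A), length(section_B)))
ranks <- rank(time)              # tied times share the average rank

# Display the combined data in ascending order, as in the table above
ord <- order(time)
data.frame(Time = time[ord], Group = group[ord], Rank = ranks[ord])

# Rank sum of the smaller sample (Section B)
R_B <- sum(ranks[group == "B"])
R_B
```

Within a pair of tied times the order of the A and B rows may differ from the table, but the ranks and the rank sum are unaffected.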


b.) Sum the ranks of the group with the smaller sample size. (Note: If both
groups have the same sample size, either one can be used.) In this case, the
sample size for Section B is smaller (n1 = 11), and the sum of its ranks is
R = 1 + 2 + 3 + 4 + 5 + 7 + 8.5 + 10.5 + 14.5 + 16.5 + 21 = 93.

c.) Substitute in the formulas to find the test value.

$$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2} = \frac{11(11 + 12 + 1)}{2} = 132$$

$$\sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{(11)(12)(11 + 12 + 1)}{12}} = \sqrt{264} = 16.2$$

$$z = \frac{R - \mu_R}{\sigma_R} = \frac{93 - 132}{16.2} = -2.41$$

Note: R is the sum of ranks for the group with the smaller sample size. If both groups have the same
sample size, either one can be used.
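The same substitution can be carried out in R. This is a sketch with illustrative variable names; note that it keeps the unrounded value of the standard deviation, so z comes out near −2.40 rather than the −2.41 obtained above with 16.2. The P-value line is an addition not shown on the slides.

```r
n1 <- 11   # smaller sample (Section B)
n2 <- 12   # larger sample (Section A)
R  <- 93   # rank sum of the smaller sample

mu_R    <- n1 * (n1 + n2 + 1) / 2
sigma_R <- sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z       <- (R - mu_R) / sigma_R

# Two-tailed P-value from the standard normal distribution
p_value <- 2 * pnorm(-abs(z))

c(mu_R = mu_R, sigma_R = sigma_R, z = z, p_value = p_value)
```

Either value of z lies below −1.96, so the small rounding difference does not change the decision.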


Step 4: Make a decision.


The decision is to reject the null hypothesis, since -2.41 < -1.96.

Step 5: Summarize the results.


There is enough evidence to support the claim that there is a difference in
the times it takes the students to complete the practicum.
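As a cross-check, the same data can be passed to wilcox.test(). This is a sketch under two assumptions stated here: exact = FALSE requests the normal approximation (the data contain ties), and correct = FALSE switches off the continuity correction so the result is closer to the hand computation. R reports the statistic W, which equals the rank sum of the first sample minus n1(n1 + 1)/2, and it adjusts the variance for ties, so the P-value is close to but not identical with the one implied by z above.

```r
section_A <- c(15, 18, 16, 17, 13, 22, 24, 17, 19, 21, 26, 28)
section_B <- c(14, 9, 16, 19, 10, 12, 11, 8, 15, 18, 25)

# Section B is listed first so that W is based on its rank sum R = 93
res <- wilcox.test(section_B, section_A, alternative = "two.sided",
                   exact = FALSE, correct = FALSE)

res$statistic   # W = R - n1 * (n1 + 1) / 2
res$p.value
```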


Wilcoxon rank-sum test in R


• To perform a two-sample test with data in stacked form, use the command:
> wilcox.test(values~groups, dataset)

• If the grouping variable has more than two levels then you must specify
which two you want to compare:
> wilcox.test(values~groups, dataset, groups %in% c("Group1",
"Group2"))
• If your data is in unstacked form (with the values for each sample held in
separate variables), use the command:
> wilcox.test(dataset$sample1, dataset$sample2)


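• For a paired test, set the paired argument to TRUE:
> wilcox.test(values~groups, dataset, paired=TRUE)
Recent versions of R may reject the paired argument in the formula interface; supplying the
two samples as separate vectors, as in wilcox.test(x, y, paired=TRUE), avoids this.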

• To perform a one-tailed test, set the alternative argument to "greater" or "less":
> wilcox.test(values~groups, dataset, alternative="greater")

• By default, there is no confidence interval included with the output. To
calculate a confidence interval for the population (pseudo)median (for one-sample
and paired tests) or for the difference in location between the groups (for
two-sample tests), set the conf.int argument to TRUE. The default confidence level is
95%, but you can adjust this with the conf.level argument:
> wilcox.test(values~groups, dataset, conf.int=TRUE,
conf.level=0.99)
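The commands above use placeholder names. The sketch below shows one way they could look with real data, stacking the practicum times from Example 1 into a data frame; the names dataset, values, and groups simply mirror the placeholders, and exact = FALSE is an assumption made here because the data contain ties.

```r
# Stacked form: one column of values, one column identifying the group
dataset <- data.frame(
  values = c(15, 18, 16, 17, 13, 22, 24, 17, 19, 21, 26, 28,   # Section A
             14, 9, 16, 19, 10, 12, 11, 8, 15, 18, 25),        # Section B
  groups = rep(c("A", "B"), times = c(12, 11))
)

# Two-sample test with a 95% confidence interval for the location shift
res <- wilcox.test(values ~ groups, data = dataset,
                   exact = FALSE, conf.int = TRUE, conf.level = 0.95)

res$statistic   # W for the first factor level ("A")
res$conf.int    # interval for the difference in location, A minus B
res$estimate
```

With the formula interface the first level of the grouping variable is treated as the first sample, so W here refers to Section A.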

Example 2: Wilcoxon rank-sum test using R


Suppose we have two groups of students' test scores from two different
teaching methods:
Method A 85 90 78 92 88 76 84 91
Method B 70 75 80 78 76 74 73 79 72 77

We want to test if there is a significant difference in scores between
the two groups.


method_A <- c(85, 90, 78, 92, 88, 76, 84, 91)
method_B <- c(70, 75, 80, 78, 76, 74, 73, 79, 72, 77)
# Two-sided rank-sum test; because of tied scores R uses the normal approximation
wilcox.test(method_A, method_B, alternative = "two.sided", paired = FALSE,
            mu = 0, conf.level = 0.95)

Ho: There is no significant difference in scores between the two groups.
Ha: There is a significant difference in scores between the two groups.

Since the P-value (0.004) is less than the 0.05 level of significance, reject Ho. Therefore, we can conclude that there
is a significant difference in scores between the two groups.


Review t-test for dependent samples


• The t-test for dependent samples, also known as the paired t-test or
matched t-test, is used to determine whether there is a significant
difference between the means of two related groups.
• This test is commonly applied when the same subjects are measured
under two different conditions or at two different times.
Assumptions:
1. Both samples are random samples.
2. The sample data are dependent.
3. When the sample size is less than 30, the population of differences must be
normally or approximately normally distributed.


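The null hypothesis is Ho: 𝜇1 = 𝜇2 or Ho: 𝜇1 − 𝜇2 = 0

Test Statistic:

$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$$

Ha                Decision Rule
𝜇1 − 𝜇2 ≠ d0     Reject Ho if $|t| \ge t_{\alpha/2,\, n-1}$
𝜇1 − 𝜇2 > d0     Reject Ho if $t > t_{\alpha,\, n-1}$
𝜇1 − 𝜇2 < d0     Reject Ho if $t < -t_{\alpha,\, n-1}$

Here d0 is the hypothesized value of $\mu_d = \mu_1 - \mu_2$ (d0 = 0 for the null hypothesis above), and

$$\bar{d} = \frac{\sum d_i}{n}, \qquad s_d = \sqrt{\frac{\sum d_i^2 - \dfrac{\left(\sum d_i\right)^2}{n}}{n - 1}}$$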


Example 1: As an aid for improving students’ study habits, nine students were
randomly selected to attend a seminar on the importance of education in life.
The table shows the number of hours each student studied per week before
and after the seminar. At 𝛼 = 0.05, did attending the seminar increase the
number of hours the students studied per week?
Student 1 2 3 4 5 6 7 8 9
Before (𝑋1 ) 9 12 6 15 3 18 10 13 7
After (𝑋2 ) 9 17 9 20 2 21 15 22 6
Solution:
Step 1: State the null (Ho) and alternative (Ha) hypotheses. Identify the claim.
Ho : 𝜇1 = 𝜇2 and Ha : 𝜇1 < 𝜇2 (claim)


Step 2: Determine the test statistic, critical value, and tail of the distribution
where the rejection region is located.
The test to use is the t-test, since the standard deviations 𝜎 are unknown.
Since the alternative hypothesis is 𝜇1 < 𝜇2, the test is left-tailed.
Since 𝛼 = 0.05 and the test is left-tailed, the critical value is
$-t_{\alpha,\, n-1} = -t_{0.05,\,8} = -1.860$.
[Figure: t distribution with the rejection region below −1.860 and the acceptance region above it.]


Step 3: Formulate the decision rule.

Since the alternative hypothesis is 𝜇1 < 𝜇2, the decision rule is to
reject Ho if $t < -t_{\alpha,\, n-1} = -1.860$.
Step 4: Compute the value of the test statistic.
Student 1 2 3 4 5 6 7 8 9
Before (𝑋1 ) 9 12 6 15 3 18 10 13 7
After (𝑋2 ) 9 17 9 20 2 21 15 22 6
𝑑𝑖 = 𝑋1 − 𝑋2 0 -5 -3 -5 1 -3 -5 -9 1

$$\bar{d} = \frac{\sum d_i}{n} = \frac{-28}{9} = -3.11$$

$$s_d = \sqrt{\frac{\sum d_i^2 - \dfrac{\left(\sum d_i\right)^2}{n}}{n - 1}} = \sqrt{\frac{176 - \dfrac{(-28)^2}{9}}{8}} = 3.33$$

$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{-3.11 - 0}{3.33 / \sqrt{9}} = -2.802$$
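The computation can be mirrored step by step in R. This is a sketch with illustrative variable names; it keeps unrounded intermediate values, so the statistic comes out as −2.80 rather than the −2.802 obtained above from the rounded values −3.11 and 3.33.

```r
before <- c(9, 12, 6, 15, 3, 18, 10, 13, 7)
after  <- c(9, 17, 9, 20, 2, 21, 15, 22, 6)

d     <- before - after                 # paired differences d_i = X1 - X2
n     <- length(d)
d_bar <- sum(d) / n
s_d   <- sqrt((sum(d^2) - sum(d)^2 / n) / (n - 1))
t_val <- (d_bar - 0) / (s_d / sqrt(n))  # mu_d = 0 under Ho

t_crit <- qt(0.05, df = n - 1)          # left-tail critical value for alpha = 0.05

round(c(d_bar = d_bar, s_d = s_d, t = t_val, t_crit = t_crit), 3)
```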

Step 5: Make a decision.

The decision is to reject the null hypothesis, since the test value −2.802 is
less than the critical value −1.860.
[Figure: the test value −2.802 falls in the rejection region below −1.860.]

Step 6: Draw a conclusion.

There is enough evidence to support the claim that attending the seminar
increased the number of hours the students studied per week.
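The same decision can be reached with the built-in t.test() function. This is a sketch; alternative = "less" matches Ha: 𝜇1 < 𝜇2 because t.test() works with the differences before − after.

```r
before <- c(9, 12, 6, 15, 3, 18, 10, 13, 7)
after  <- c(9, 17, 9, 20, 2, 21, 15, 22, 6)

# Paired, left-tailed test of Ho: mean(before - after) = 0
res <- t.test(before, after, alternative = "less", mu = 0, paired = TRUE)

res$statistic   # t value
res$parameter   # degrees of freedom, n - 1
res$p.value
```

A P-value below 0.05 corresponds to a test value in the rejection region, so the P-value approach and the critical-value approach agree.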


Example: t-test for dependent samples using R


The following are the scores of eight students before and after a review. At
5% level of significance, test if the review is effective.

Student 1 2 3 4 5 6 7 8
Before 77 74 82 73 87 68 66 80
After 72 68 76 68 84 68 61 76


Before <- c(77, 74, 82, 73, 87, 68, 66, 80)
After <- c(72, 68, 76, 68, 84, 68, 61, 76)
t.test(Before,After, alternative="less",
mu=0, paired = TRUE, conf.level = 0.95)

Ho: The review is not effective (the mean score after the review is not higher than before).
Ha: The review is effective (the mean score after the review is higher than before).
Since the P-value (0.9997) is not less than the 0.05 level of significance, we fail to reject Ho. Therefore, there is
not enough evidence that the scores after the review are higher than the scores before it, so we cannot conclude
that the review is effective.
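In a one-tailed paired test the direction of the alternative refers to the differences taken in the order the samples are passed to t.test(). The sketch below, with illustrative object names, inspects the differences Before − After and runs both one-tailed alternatives to show how the choice affects the P-value.

```r
Before <- c(77, 74, 82, 73, 87, 68, 66, 80)
After  <- c(72, 68, 76, 68, 84, 68, 61, 76)

d <- Before - After
mean(d)   # a positive mean difference means scores were lower after the review

# alternative = "less" tests whether mean(Before - After) < 0
less    <- t.test(Before, After, alternative = "less", paired = TRUE)
# alternative = "greater" tests whether mean(Before - After) > 0
greater <- t.test(Before, After, alternative = "greater", paired = TRUE)

c(less = less$p.value, greater = greater$p.value)
```

The two one-tailed P-values sum to 1, so a value very close to 1 in one direction signals a strong effect in the opposite direction.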


Wilcoxon signed rank test


• The Wilcoxon signed-rank test is a non-parametric test used to compare
two related groups when the data do not meet the assumptions of a
paired t-test, such as normality. It is often used when the differences
between pairs are not normally distributed.

When to Use the Wilcoxon Signed-Rank Test


1. The data are paired or matched.
2. The differences between pairs are not normally distributed.
3. You want to test for a significant difference in the medians of the
paired data.


Example 1: A researcher decides to see how effective a pain medication is. Seven randomly
selected subjects were asked to determine the severity of their pain by using a scale of 1
to 12, with 1 being very minor and 12 being very severe. Then each was given the
medication, and after 1 hour, they were asked to rate the severity of their pain, using the
same scale. Is there enough evidence to support the claim, at 𝛼 = 0.05, that there is a
difference in the severity of the pain before and after the pain medication?

Subject  Before  After
1        7       5
2        2       3
3        3       4
4        6       3
5        5       1
6        8       6
7        12      4

Solution:

Step 1: State the hypotheses and identify the claim.
Ho: There is no difference in the severity of the
pain before and after the pain medication.
Ha: There is a difference in the severity of the
pain before and after the pain medication. (claim)

Step 2: Find the critical value.

Since n = 7 and 𝛼 = 0.05 for this two-tailed test, the critical value from the
table of critical values for the Wilcoxon signed-rank test is 2.


Step 3: Find the test value.

a.) Make a table as shown.
Subject  Before  After  Difference D = Before − After  Absolute value |D|  Rank of |D|  Signed rank
1        7       5      7 − 5 = 2                      2                   3.5          +3.5
2        2       3      2 − 3 = −1                     1                   1.5          −1.5
3        3       4      3 − 4 = −1                     1                   1.5          −1.5
4        6       3      6 − 3 = 3                      3                   5            +5
5        5       1      5 − 1 = 4                      4                   6            +6
6        8       6      8 − 6 = 2                      2                   3.5          +3.5
7        12      4      12 − 4 = 8                     8                   7            +7
Note: Any difference of 0 should be ignored.

b.) Find the sum of the positive ranks and the sum of the negative ranks
separately.
Positive rank sum: (+3.5) + (+5) + (+6) + (+3.5) + (+7) = +25
Negative rank sum: (−1.5) + (−1.5) = −3
c.) Select the smaller of the absolute values of the sums, |−3| = 3, and use this
absolute value as the test value. In this case the test value is 3.
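The table and the rank sums can be reproduced in a few lines of R. This is a sketch with illustrative variable names; rank() gives tied absolute differences the average rank, as in the table.

```r
before <- c(7, 2, 3, 6, 5, 8, 12)
after  <- c(5, 3, 4, 3, 1, 6, 4)

d <- before - after
d <- d[d != 0]                  # differences of 0 are ignored
r <- rank(abs(d))               # ranks of the absolute differences
signed_rank <- sign(d) * r

pos_sum <- sum(r[d > 0])        # sum of the positive ranks
neg_sum <- sum(r[d < 0])        # absolute value of the sum of the negative ranks
w_s     <- min(pos_sum, neg_sum)

data.frame(D = d, abs_D = abs(d), Rank = r, Signed_rank = signed_rank)
c(pos_sum = pos_sum, neg_sum = neg_sum, test_value = w_s)
```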
Step 4: Make a decision.
Reject the null hypothesis if the test value is less than or equal to the critical
value. In this case, 3 > 2; hence, the decision is to not reject the null
hypothesis.
Step 5: Summarize the results.
There is not enough evidence to support the claim that there is a difference
in the severity of the pain before and after the pain medication.
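For comparison, wilcox.test() can run the same paired test. This is a sketch; exact = FALSE is an assumption made here because the differences contain ties, in which case R uses a normal approximation instead of the table of critical values. The statistic V reported by R is the sum of the positive ranks, not the smaller of the two sums.

```r
before <- c(7, 2, 3, 6, 5, 8, 12)
after  <- c(5, 3, 4, 3, 1, 6, 4)

res <- wilcox.test(before, after, paired = TRUE,
                   alternative = "two.sided", exact = FALSE)

res$statistic   # V = sum of the positive ranks
res$p.value
```

A P-value above 0.05 agrees with the decision not to reject the null hypothesis.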

Example: Wilcoxon signed rank test using R


Suppose a teacher wants to evaluate whether a new teaching method
improves students' performance. The teacher collects test scores from 8
students before and after applying the new method.
Before 78 85 90 88 76 89 93 82
After 84 88 92 91 80 90 94 85


> before <- c(78, 85, 90, 88, 76, 89, 93, 82)
> after <- c(84, 88, 92, 91, 80, 90, 94, 85)
> wilcox.test(before, after, paired = TRUE)

Ho: There is no significant difference in student performance before and after the new teaching method.
Ha: There is a significant difference in student performance before and after the new teaching method.
Since the P-value (0.01368) is less than the 0.05 level of significance, reject Ho. Therefore, we can conclude that there is a
significant difference in student performance before and after the new teaching method.
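Since the teacher's question is whether the new method improves performance, a one-tailed version of the test may also be of interest. The sketch below is an illustration under stated assumptions: alternative = "less" expresses "before scores tend to be lower than after scores", and exact = FALSE requests the normal approximation because the differences contain ties.

```r
before <- c(78, 85, 90, 88, 76, 89, 93, 82)
after  <- c(84, 88, 92, 91, 80, 90, 94, 85)

# Two-sided test, as in the example above
two_sided <- wilcox.test(before, after, paired = TRUE, exact = FALSE)

# One-tailed test of whether scores before are lower than scores after
one_sided <- wilcox.test(before, after, paired = TRUE, exact = FALSE,
                         alternative = "less")

c(V = unname(two_sided$statistic),
  two_sided_p = two_sided$p.value,
  one_sided_p = one_sided$p.value)
```

Every student scored higher after the new method, so the sum of the positive ranks V is 0 and the one-tailed P-value is smaller than the two-sided one.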
