Unit 4 Hypothesis Testing (One Sample Mean) (SY22)


UNIT 4

INTRODUCTION TO HYPOTHESIS
TESTING

Mathematics Department
XAVIER UNIVERSITY-ATENEO DE CAGAYAN
INFERENTIAL
STATISTICS
An area in Statistics that deals with
methods used to make generalizations
or inferences about some
characteristics of the population based
on information contained in a sample
Hypothesis Testing
- one type of inferential analysis
- a decision-making process for testing the claims
about a population parameter based on the
characteristics of a sample randomly taken from
the population
Important Terms
■ The null hypothesis, H0, is a statement that specifies a
particular value (or values) for the parameter being studied. It is
the hypothesis that is being tested.
■ The alternative hypothesis, H1, specifies those values of the
parameter that represent an important change from the null
hypothesis. It opposes the null hypothesis.
■ The level of significance, α, is the probability of committing a
Type I error, that is, of rejecting H0 when it is actually true. It is
the maximum probability with which we are willing to commit a Type I
error. In practice, it is usually set at 0.05 (* significant) or
0.01 (** highly significant).
■ The test statistic is a statistic computed from the sample on
which the decision to reject or not to reject H0 is based. If the
computed test statistic falls in the rejection region, the H0 is
rejected.
■ The critical value is the boundary between the rejection region
and the nonrejection region.
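As an illustration of how the level of significance fixes the critical value, the short Python sketch below computes critical values of the standard normal (Z) distribution. The function name `z_critical` and the symbol `alpha` for the level of significance are assumptions made for this example; they do not appear in the original slides.

```python
from statistics import NormalDist


def z_critical(alpha: float, two_tailed: bool = True) -> float:
    """Return the positive critical value of the standard normal distribution.

    For a two-tailed test the area alpha is split equally between both tails.
    """
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


# Critical values for the two significance levels commonly used in practice.
for alpha in (0.05, 0.01):
    two = z_critical(alpha)
    one = z_critical(alpha, two_tailed=False)
    print(f"alpha = {alpha}: two-tailed +/-{two:.3f}, one-tailed {one:.3f}")
```

A computed test statistic beyond these boundaries falls in the rejection region.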
The following is a summary of the steps in performing hypothesis
testing.

1. Formulate the null and alternative hypotheses.
2. Specify the level of significance to be used.
3. Select the appropriate test statistic.
4. Establish the rejection region/regions.
5. Compute the actual value of the test statistic from the sample.
6. Make a statistical decision.
7. Draw the appropriate conclusion.
1. Testing One Population Mean (Z test, t test)

Z test – based on the Z distribution (Standard Normal Distribution).
Most computer programs do not have this option because the Z test
and the t test give approximately the same output when the sample size is
large.
t test – based on the t distribution.
The degrees of freedom relate to the number of observations that are free to vary.
The one-sample t test is available in SPSS.
Properties of a Normal Distribution
■ The normal curve is symmetrical about the mean μ.
■ In a standard normal distribution, the mean is zero.
■ The mean is at the middle and divides the area into halves.
■ The total area under the curve is equal to 1.
■ It is completely determined by its mean and standard deviation σ (or
variance σ²).

Properties of the t Distribution
■ It is, like the Z distribution, a bell-shaped, symmetrical,
continuous distribution.
■ There is not one t distribution but rather a “family” of t
distributions. All have the same mean (center) of zero, but their
standard deviations differ according to the sample size n.
■ The t distribution is more spread out and flatter at the center
than the Z distribution.
■ However, as the sample size increases, the t distribution
approaches the standard normal distribution.
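To make the last two properties concrete, the sketch below uses the known formula for the standard deviation of a t distribution with df degrees of freedom, sqrt(df / (df - 2)), which is defined only for df > 2. The function name `t_std_dev` is an assumption made for this example, and df = n - 1 is taken as in the one-sample t test.

```python
import math


def t_std_dev(df: int) -> float:
    """Standard deviation of Student's t distribution (finite only for df > 2)."""
    if df <= 2:
        raise ValueError("the variance of the t distribution is finite only for df > 2")
    return math.sqrt(df / (df - 2))


# As the sample size grows, the standard deviation shrinks toward 1,
# the standard deviation of the standard normal (Z) distribution.
for n in (5, 10, 30, 100, 1000):
    df = n - 1
    print(f"n = {n:4d}, df = {df:3d}, sd = {t_std_dev(df):.4f}")
```

The standard deviation is always above 1, which reflects the wider spread of the t distribution, and it gets closer to 1 as n increases.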
Decision of the test
■ i. The null hypothesis is rejected when the computed test
statistic falls in the rejection region.
■ ii. The null hypothesis is not rejected when the computed test
statistic falls in the non-rejection region.

Conclusion of the test


■ 1. When the null hypothesis is rejected, then this implies that
the sample data provide sufficient evidence to contradict the
null hypothesis in favor of the alternative hypothesis.
■ 2. When the null hypothesis is not rejected, then the sample
data do not provide sufficient evidence to contradict the null
hypothesis. However, failing to reject the null hypothesis does
not necessarily mean that it is true, only that we do not have
enough evidence to reject it.
p-value
Note: Statistical software reports a p-value.

A p-value is the probability of observing a test statistic value at least as
extreme as the one computed from the sample data if the null
hypothesis is true.

As a rule of thumb, H0 is rejected if the p-value is less than or
equal to α (level of significance).

A small p-value basically means that your data are unlikely under the null
hypothesis.
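The definition above can be sketched in Python for a Z statistic. The function name `z_p_value`, its `tail` argument, and the value z = 2.10 in the usage lines are hypothetical and chosen only for illustration; they are not from the original slides.

```python
from statistics import NormalDist


def z_p_value(z: float, tail: str = "two") -> float:
    """p-value of a Z statistic under H0 for a left-, right-, or two-tailed test."""
    phi = NormalDist().cdf  # cumulative probability of the standard normal
    if tail == "left":
        return phi(z)
    if tail == "right":
        return 1 - phi(z)
    if tail == "two":
        # Probability of a value at least as extreme as z in either direction.
        return 2 * (1 - phi(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


# Hypothetical test statistic, used only to show the rule of thumb.
alpha = 0.05
p = z_p_value(2.10)
print("reject H0" if p <= alpha else "do not reject H0")
```

Comparing the p-value with the level of significance gives the same decision as comparing the test statistic with the critical value.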
Example 1 (two-tailed Z test).
Lighthouse Electrical Company manufactures light bulbs that
have a lifetime that is approximately normally distributed with a
mean of 750 hours. A random sample of 30 bulbs was
tested and showed a mean lifetime of 738 hours. The population
standard deviation is known to be 37.5 hours. At
0.05 level of significance, test the hypothesis that the mean
lifetime of light bulbs is significantly different from 750 hours.
Solution:
Let $\mu$ = mean lifetime of light bulbs in hours

Steps in hypothesis testing

1. Null and alternative hypotheses
H0: The mean lifetime of light bulbs is 750 hours.
H1: The mean lifetime of light bulbs is significantly different from 750
hours.
(In symbols)
$H_0: \mu = 750$ hours and $H_1: \mu \neq 750$ hours

2. Significance level: $\alpha = 0.05$

3. Test Statistic: Since $\sigma$ is known and the population is
approximately normally distributed, the appropriate test
statistic is the Z test, where

$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
4. Rejection Regions: Since it is a two-tailed test based on $H_1: \mu \neq 750$
hours, the rejection regions are in both tails of the Z distribution, given by

$$Z < -Z_{\alpha/2} \quad \text{or} \quad Z > Z_{\alpha/2}$$

$$Z < -Z_{0.025} \quad \text{or} \quad Z > Z_{0.025}$$

Based on the Z distribution table, the critical value of the test is $\pm 1.96$.

Thus, H0 is rejected if $Z < -1.96$ or $Z > 1.96$;
otherwise, H0 is not rejected.
5. Computation of the test statistic

$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{738 - 750}{37.5 / \sqrt{30}} = -1.75$$

6. Statistical decision: Since Z = -1.75 is not in the rejection region, H0 is
not rejected.
(Note: Z = -1.75 is within the non-rejection region, between -1.96 and 1.96.)

7. Conclusion
There is insufficient evidence to claim that the mean lifetime of the light
bulbs is significantly different from 750 hours.
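The steps of Example 1 can be retraced in a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative and not from the original slides, and the critical value is computed with `statistics.NormalDist` instead of being read from a Z table.

```python
import math
from statistics import NormalDist

# Values given in Example 1.
mu0 = 750      # hypothesized mean lifetime (hours)
x_bar = 738    # sample mean (hours)
sigma = 37.5   # known population standard deviation (hours)
n = 30         # sample size
alpha = 0.05   # level of significance

# Step 5: test statistic Z = (x_bar - mu0) / (sigma / sqrt(n)).
z = (x_bar - mu0) / (sigma / math.sqrt(n))

# Step 4: two-tailed critical value, so alpha is split between both tails.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Two-tailed p-value, as statistical software would report it.
p_value = 2 * NormalDist().cdf(-abs(z))

# Step 6: reject H0 only if Z falls in either rejection region.
reject = abs(z) > z_crit

print(f"Z = {z:.2f}, critical value = +/-{z_crit:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject else "Do not reject H0")
```

Both approaches agree here: the statistic lies inside the non-rejection region and the p-value is larger than 0.05.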
Example 2 (one-tailed t test)
A manager of a hotdog company claims that the mean time to
pack one dozen hotdogs manually is 11 seconds. Suppose a
random sample of 20 workers spent a mean of 13 seconds,
with a standard deviation of 2.3 seconds, to pack a dozen
hotdogs. At 0.01 level of significance, do the sample results
provide sufficient evidence to conclude that it takes more than
11 seconds to pack one dozen hotdogs manually? Assume
that the length of time is normally distributed.
Solution:
Let $\mu$ = mean length of time to pack one dozen hotdogs manually

Steps in hypothesis testing

1. Null and alternative hypotheses
H0: The mean length of time to pack one dozen hotdogs manually is 11
seconds.
H1: The mean length of time to pack one dozen hotdogs manually is more
than 11 seconds.

(In symbols) $H_0: \mu = 11$ seconds and $H_1: \mu > 11$ seconds

2. Significance level: $\alpha = 0.01$


3. Test Statistic: Since $\sigma$ is unknown, the distribution of the length of time is
normally distributed, and the sample size $n = 20$ (less than 30), the
appropriate test statistic is the t test, where

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
4. Rejection Region: Since it is a one-tailed test based on $H_1: \mu > 11$
seconds, the rejection region is in the right tail of the t distribution, given by

$$t > t_{\alpha} \quad \text{with degrees of freedom } df = n - 1 = 20 - 1 = 19$$

$$t > t_{0.01}$$

Thus, H0 is rejected if $t > 2.539$; otherwise, H0 is not rejected.


5. Computation of the test statistic

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{13 - 11}{2.3 / \sqrt{20}} = 3.889$$

6. Statistical decision
Since t = 3.889 is in the rejection region, H0 is rejected.

7. Conclusion: There is sufficient evidence to claim that the mean
time to pack one dozen hotdogs manually is more than 11 seconds.
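Example 2 can be checked the same way. Python's standard library has no t distribution, so this sketch takes the critical value 2.539 (df = 19, one-tailed, 0.01 level) from the example rather than computing it; the variable names are illustrative and not from the original slides.

```python
import math

# Values given in Example 2.
mu0 = 11        # claimed mean packing time (seconds)
x_bar = 13      # sample mean (seconds)
s = 2.3         # sample standard deviation (seconds)
n = 20          # sample size
t_crit = 2.539  # critical value from the t table, as stated in the example

# Degrees of freedom for the one-sample t test.
df = n - 1

# Test statistic t = (x_bar - mu0) / (s / sqrt(n)).
t = (x_bar - mu0) / (s / math.sqrt(n))

# Right-tailed test: reject H0 only for large positive t.
reject = t > t_crit

print(f"df = {df}, t = {t:.3f}, critical value = {t_crit}")
print("Reject H0" if reject else "Do not reject H0")
```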
Example: One Sample t test in Excel
You are investigating the mean age of patients in NMMC who
had undergone appendectomy. Test at $\alpha = 0.05$ whether the mean
age is 20 years old, given the following ages of a random
sample of 18 respondents:

36 28 21 25 31 17 22 18 18
29 21 26 17 18 30 19 19 28

Ho: µ = 20 yrs old
H1: µ ≠ 20 yrs old

Note: Check for normality of data before performing t-test.

Note: The t test for one sample is not available in Excel 2007, but we can
“fool” Excel into doing it :))
Steps: Excel Trick
1. Encode the AGE data in a column. Add
a DUMMY column containing at least two zeroes.

2. Click Data – Data Analysis – t-Test: Two-Sample
Assuming Unequal Variances

3. Enter the following:
Variable 1 Range: Highlight Age Data
Variable 2 Range: Highlight Dummy
Hypothesized Mean Difference: 20
Check Labels.
Alpha: 0.05
Output Range: select any cell where
you want to display the output

4. Click OK.

$H_0: \mu = 20$ yrs. old is equivalent to $H_0: \mu - 0 = 20$ yrs. old
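A short derivation shows why the dummy column works. It assumes that Excel's unequal-variances tool uses the standard Welch statistic and the Welch-Satterthwaite degrees of freedom; the subscripts 1 (AGE) and 2 (Dummy) and the symbol $D_0$ for the hypothesized mean difference are notation introduced here, not taken from the slides.

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}, \qquad df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$

With a dummy column of zeroes, $\bar{x}_2 = 0$ and $s_2^2 = 0$, so

$$t = \frac{\bar{x}_1 - D_0}{s_1 / \sqrt{n_1}}, \qquad df = \frac{(s_1^2/n_1)^2}{(s_1^2/n_1)^2 / (n_1 - 1)} = n_1 - 1$$

which is the one-sample t statistic with $D_0 = 20$ and $df = 18 - 1 = 17$. At least two zeroes are needed so that the dummy variance, which divides by $n_2 - 1$, is defined.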
Edit the table.

Excel output
t-Test: Two-Sample Assuming Unequal Variances

                                AGE           Dummy
Mean                            23.5          0
Variance                        33.2058824    0
Observations                    18            2
Hypothesized Mean Difference    20
df                              17
t Stat                          2.57689537
P(T<=t) one-tail                0.00979665
t Critical one-tail             1.73960673
P(T<=t) two-tail                0.0195933
t Critical two-tail             2.10981558

Edited table
t-Test: One-Sample

                                AGE
Mean                            23.5
Variance                        33.2058824
Observations                    18
Hypothesized Mean               20
df                              17
t Stat                          2.57689537
P(T<=t) one-tail                0.00979665
t Critical one-tail             1.73960673
P(T<=t) two-tail                0.0195933
t Critical two-tail             2.10981558
Interpretation: One Sample t-test
Decision Rule (two-tailed test)
1. Based on critical value: Reject Ho in favor of H1 if |t Stat| > t Critical two-tail.
2. Based on p-value: Reject Ho in favor of H1 if P(T<=t) two-tail ≤ 0.05.

Ho: µ = 20 yrs old
H1: µ ≠ 20 yrs old

                                AGE
Mean                            23.5
Variance                        33.2058824
Observations                    18
Hypothesized Mean               20
df                              17
t Stat                          2.57689537
P(T<=t) one-tail                0.00979665
t Critical one-tail             1.73960673
P(T<=t) two-tail                0.0195933
t Critical two-tail             2.10981558

Decision/Conclusion: Reject Ho, since t Stat = 2.577 exceeds t Critical
two-tail = 2.110 (equivalently, the two-tail p-value 0.0196 is less than 0.05).
The mean age of patients in NMMC who had undergone appendectomy is
significantly different from 20 yrs old.
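The Excel figures can be cross-checked directly from the 18 ages. The sketch below uses only Python's standard library, so the two-tailed critical value is copied from the Excel output rather than computed; the variable names are illustrative and not from the original slides.

```python
import math
from statistics import mean, variance

# Ages of the 18 respondents from the example.
ages = [36, 28, 21, 25, 31, 17, 22, 18, 18,
        29, 21, 26, 17, 18, 30, 19, 19, 28]

mu0 = 20                      # hypothesized mean age (years)
t_crit_two_tail = 2.10981558  # copied from the Excel output (df = 17, alpha = 0.05)

n = len(ages)
x_bar = mean(ages)      # sample mean
s2 = variance(ages)     # sample variance (divides by n - 1)
df = n - 1

# One-sample t statistic.
t_stat = (x_bar - mu0) / math.sqrt(s2 / n)

# Two-tailed decision rule: compare the absolute value with the critical value.
reject = abs(t_stat) > t_crit_two_tail

print(f"mean = {x_bar}, variance = {s2:.7f}, df = {df}, t Stat = {t_stat:.8f}")
print("Reject Ho" if reject else "Do not reject Ho")
```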
SPSS: Analyze -> Compare Means -> One Sample t test
Ho: µ = 20 yrs old
H1: µ ≠ 20 yrs old
By default, SPSS provides only the two-tailed probability. Since the p-value
is less than 0.05, Ho is rejected. The mean age of patients in NMMC who had
undergone appendectomy is significantly different from 20 yrs old.

Furthermore, the fact that the t-statistic is positive tells us that the
mean age is significantly greater than 20 years old.
PSPP: Analyze -> Compare Means -> One Sample T Test

Jamovi: Analyses -> T-Tests -> One Sample T-Test

If nonparametric -> Wilcoxon Rank
Assumptions of Parametric Tests
1. Normally distributed data: The rationale behind hypothesis
testing relies on having something that is normally distributed.

2. Homogeneity of variance: In designs in which you test several
groups of participants this assumption means that each of these
samples comes from populations with the same variance.

3. Interval/Ratio data: Data should be measured at least at the
interval level (interval or ratio).

4. Independence: In some cases it means that data from different
participants are independent, which means that the behaviour of
one participant does not influence the behaviour of another.
Test for Normality in SPSS

Analyze -> Descriptive Statistics -> Explore
-> Plots -> Normality plots with tests

(Ho: The data are normally distributed.)

Since the p-value (Sig.) 0.316 is not less than 0.05, Ho is not rejected, and
we conclude that the data may be treated as coming from a normal distribution.

Note: In large samples these tests can be significant even when the scores are only
slightly different from a normal distribution. Therefore, they should always be
interpreted in conjunction with histograms, P–P or Q–Q plots, and the values of
skewness and kurtosis.
Jamovi: Analyses -> Exploration -> Descriptives -> Statistics -> Shapiro-Wilk

PSPP: Analyze -> Descriptive Statistics -> Explore -> Plots ->
Normality plots with tests
To check for normality using Excel, you may compute the skewness and kurtosis
to see if the distribution of the data is symmetric and mesokurtic, which are two
of the characteristics of a normal distribution.
Example:
Age data: 36 28 21 25 31 17 22 18 18 29 21 26 17 18 30 19 19 28

                        Skewness     Kurtosis
Statistic               0.60592      -0.696233
Std. error              0.577        1.155
2*std. error (+/-)      1.155        2.31
Interpretation          symmetric    mesokurtic
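The skewness and kurtosis figures in this example can be reproduced in Python. The sketch below assumes the sample formulas used by Excel's SKEW and KURT functions and the approximate standard errors sqrt(6/n) and sqrt(24/n), which match the values 0.577 and 1.155 shown for n = 18; the function names are made up for this illustration.

```python
import math
from statistics import mean, stdev

ages = [36, 28, 21, 25, 31, 17, 22, 18, 18,
        29, 21, 26, 17, 18, 30, 19, 19, 28]


def sample_skewness(data):
    """Sample skewness, using the same formula as Excel's SKEW."""
    n, m, s = len(data), mean(data), stdev(data)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in data)


def sample_excess_kurtosis(data):
    """Sample excess kurtosis, using the same formula as Excel's KURT."""
    n, m, s = len(data), mean(data), stdev(data)
    fourth = sum(((x - m) / s) ** 4 for x in data)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))


n = len(ages)
skew = sample_skewness(ages)
kurt = sample_excess_kurtosis(ages)
se_skew = math.sqrt(6 / n)    # approximate standard error of skewness
se_kurt = math.sqrt(24 / n)   # approximate standard error of kurtosis

# Within two standard errors of zero is read as symmetric / mesokurtic.
symmetric = abs(skew) < 2 * se_skew
mesokurtic = abs(kurt) < 2 * se_kurt

print(f"skewness = {skew:.5f} (2*SE = {2 * se_skew:.3f}), symmetric: {symmetric}")
print(f"kurtosis = {kurt:.6f} (2*SE = {2 * se_kurt:.3f}), mesokurtic: {mesokurtic}")
```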
Test for Homogeneity of Variances in SPSS

Analyze -> Compare Means -> One-way ANOVA
-> Options -> Homogeneity of Variance test

Example

(Ho: The variances are equal.)

Since the p-value (Sig.) 0.410 is not less than 0.05, Ho is not rejected, and
we conclude that the variances may be assumed equal.

Rule of thumb: In most cases, if the ratio of sample variances is at
most 3, then it is safe to assume that the population variances are
equal.
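The rule of thumb can be applied with a small helper. In this sketch the ratio is taken as the largest sample variance divided by the smallest, which is one reading of the rule; the function name and the two groups of values are hypothetical and serve only as an illustration.

```python
from statistics import variance


def variance_ratio(*groups):
    """Ratio of the largest to the smallest sample variance among the groups."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


# Hypothetical data for two groups (not from the original slides).
group_a = [12, 15, 11, 14, 13]
group_b = [10, 18, 9, 16, 12]

ratio = variance_ratio(group_a, group_b)
equal_variances_plausible = ratio <= 3

print(f"variance ratio = {ratio:.2f}")
print("equal variances plausible" if equal_variances_plausible
      else "equal variances doubtful")
```

When the ratio exceeds 3, as it does for these made-up values, a formal homogeneity of variance test is the safer check.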
Jamovi: Analyses (Choose a statistical test)

PSPP: Analyze -> Compare Means -> One-way ANOVA
-> Options -> Homogeneity of Variance test
