Testing of Hypothesis

The document provides an overview of hypothesis testing, including the definitions of null and alternative hypotheses, the procedures for testing them, and the risks associated with decision-making in this context. It explains Type I and Type II errors, significance levels, and the power of a test, along with examples and practice questions. Additionally, it discusses the one-sample Z-test and its application in determining statistical significance.

Uploaded by

arpitadash024

Data-Science-08

Hypothesis Testing
Introduction Hypothesis
• Hypothesis testing typically begins with some theory, claim, or
assertion about a particular parameter of a population
• These hypothetical statements are tested for their validity by the
information provided by random samples drawn from their
corresponding populations

Dr. K. Adi Narayana Reddy, Assoc.Prof, DSAI, IFHE


Null Hypothesis
• The hypothesis that the population parameter is equal to the company specification is referred to as the null hypothesis.
• It is a statement of no difference (equality), or of no significant difference, and is denoted by 𝐻0.
• It is a statement of a null or neutral attitude.
• 𝐻0 : The difference in quality of tube lights among three brands is insignificant.

• 𝐻0 : The performance of the new drug is the same as that of the old drug, i.e., the new drug is not better than the old one.

• Usually, a null hypothesis is expressed with an "=" sign, and sometimes with ">=" or "<=".
• The null hypothesis is that the filling process is working properly, and therefore the mean fill is the 368-gram specification
𝐻0 : 𝜇 = 368

Alternative Hypothesis
• It is the rival, complementary, or opposite hypothesis to the null hypothesis and is denoted by 𝐻1 or 𝐻𝑎.
• 𝐻1 : The difference in quality of tube lights among three brands is significant.

• 𝐻1 : The performance of new drug is better than old one.


• The alternative hypothesis is that the filling process is not working properly, and therefore the mean fill is not the 368-gram specification
𝐻1 : 𝜇 ≠ 368
Summary
The following key points summarize the null and alternative hypotheses:
➢The null hypothesis, 𝐻0 , represents the status quo or the current belief in a situation.
➢The alternative hypothesis, 𝐻1 , is the opposite of the null hypothesis and represents
a research claim or specific inference you would like to prove.
➢If you reject the null hypothesis, you have statistical evidence that the alternative
hypothesis is correct.
➢If you do not reject the null hypothesis, you have failed to prove the alternative
hypothesis. The failure to prove the alternative hypothesis, however, does not mean
that you have proven the null hypothesis.
➢The null hypothesis, 𝐻0 , always refers to a specified value of the population
parameter (such as 𝜇), not a sample statistic (such as X̄).
➢The statement of the null hypothesis always contains an equal sign regarding the
specified value of the population parameter.
➢The statement of the alternative hypothesis never contains an equal sign regarding
the specified value of the population parameter.

Procedure for Hypothesis testing
SET UP NULL AND ALTERNATIVE HYPOTHESES: PRACTICE QUESTIONS
• You are the manager of a fast-food restaurant. You want to determine
whether the waiting time to place an order has changed in the past
month from its previous population mean value of 4.5 minutes. State
the null and alternative hypotheses
Answer:
• The null hypothesis is that the population mean has not changed
from its previous value of 4.5 minutes 𝐻0 : 𝜇 = 4.5
• The alternative hypothesis is that the population mean is not 4.5
minutes 𝐻1 : 𝜇 ≠ 4.5

Example 1
• A new manufacturing method is believed to be better than the
current method
• Alternative Hypothesis
• A new manufacturing method is better
• Null Hypothesis
• The method is no better than the old method

Example 2
• A new bonus plan is developed in an attempt to increase sales
• Alternative Hypothesis
• A new bonus plan increases the sales
• Null Hypothesis
• New bonus plan does not increase the sales

Example 3
• A new drug is developed with the goal of lowering Cholesterol level
more than the existing drug
• Alternative Hypothesis
• The new drug lowers Cholesterol-level more than the existing drug
• Null Hypothesis
• The new drug doesn’t lower Cholesterol-level more than the existing drug

The Critical Value of the Test Statistic
• In the Oxford Cereal Company scenario, the null hypothesis is that the mean
amount of cereal per box in the entire filling process is 368 grams
• You select a sample of boxes from the filling process, weigh each box, and
compute the sample mean
• This statistic is an estimate of the corresponding parameter
• Even if the null hypothesis is true, the statistic (the sample mean, X̄) is likely to
differ from the value of the parameter
• For example, if the sample mean is 367.9, you would likely conclude that the population
mean has not changed, because 367.9 is very close to the hypothesized 368
• If the sample mean is 320, you would likely conclude that the population mean is not 368,
because 320 is very far from it
• Determining what is very close and what is very different is arbitrary without
clear definitions

Risks in Decision Making Using Hypothesis-Testing Methodology
• A Type I error occurs if you reject the null hypothesis, 𝐻0 , when it is
true and should not be rejected. The probability of a Type I error
occurring is 𝛼.
• A Type II error occurs if you do not reject the null hypothesis, 𝐻0 ,
when it is false and should be rejected. The probability of a Type II
error occurring is 𝛽.

Regions of Rejection and Nonrejection
➢ The sampling distribution of the test statistic is divided into two regions, a region of rejection
(sometimes called the critical region) and a region of nonrejection
➢ If a value of the test statistic falls into the rejection region, you reject the null hypothesis
➢ If the test statistic falls into the region of nonrejection, you do not reject the null hypothesis.

Risk in Decision Making
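• The table illustrates the results of the two possible decisions (do not reject 𝐻0 or reject 𝐻0) against the actual situation:

Decision | 𝐻0 true | 𝐻0 false
Do not reject 𝐻0 | Correct decision (probability 1 − 𝛼) | Type II error (probability 𝛽)
Reject 𝐻0 | Type I error (probability 𝛼) | Correct decision (probability 1 − 𝛽)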

Type-I and Type-II Errors
Errors in Hypothesis Testing
P(Type I error)

➢ 𝜶 denotes the probability of making a Type I error

➢ 𝜶 = P(Rejecting 𝐻0 | 𝐻0 is true)

P(Type II error)

➢ 𝜷 denotes the probability of making a Type II error

➢ 𝜷 = P(Not rejecting 𝐻0 | 𝐻0 is false)

Note:
➢ 𝜶 and 𝜷 are not independent of each other: for a fixed sample size, as one increases, the other decreases.
➢ When the sample size increases, both can be decreased, since sampling error is reduced.
➢ In general, we focus on Type I error, but Type II error is also important, particularly when the sample size is
small.
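
The definition 𝜶 = P(Rejecting 𝐻0 | 𝐻0 is true) can be checked by simulation. The sketch below is only an illustration: it borrows the cereal-filling numbers used elsewhere in these slides (𝜇 = 368, 𝜎 = 15, 𝑛 = 25), assumes normally distributed fills, and uses a two-tail Z test. The function name, the number of trials, and the seed are arbitrary choices, not part of the original material.

```python
import math
import random
from statistics import NormalDist, fmean


def type_i_error_rate(mu0=368.0, sigma=15.0, n=25, alpha=0.05,
                      trials=2000, seed=0):
    """Estimate how often a two-tail Z test rejects H0 when H0 is true."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    sem = sigma / math.sqrt(n)                     # standard error of the mean
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 really holds (mean = mu0).
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / sem
        if abs(z) > z_crit:                        # falls in the rejection region
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

With enough trials the estimated rate should settle near the chosen 𝛼, since every rejection in this simulation is a Type I error by construction.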

Level of Significance
Level of significance (𝜶)
➢ You control the Type I error by deciding the risk level, 𝛼, that you are willing to
have of rejecting the null hypothesis when it is true
➢ Commonly selected levels are 0.01, 0.05, or 0.10
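
Each level of significance corresponds to a critical value of Z that marks the edge of the rejection region. As a small sketch, the critical values for the three common levels can be obtained from the standard normal distribution using Python's standard library; the helper name `critical_z` is made up for this illustration.

```python
from statistics import NormalDist


def critical_z(alpha, two_tail=True):
    """Return the positive critical Z value for a given level of significance."""
    # A two-tail test splits alpha between both tails; a one-tail test does not.
    tail_area = alpha / 2 if two_tail else alpha
    return NormalDist().inv_cdf(1 - tail_area)


for alpha in (0.01, 0.05, 0.10):
    two = critical_z(alpha)
    one = critical_z(alpha, two_tail=False)
    print(f"alpha = {alpha:.2f}: two-tail ±{two:.3f}, one-tail {one:.3f}")
```

For 𝛼 = 0.05 this gives the familiar two-tail value of about 1.96 and a one-tail value of about 1.645.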

The Confidence Coefficient
• The complement of the probability of a Type I error, (1 - 𝛼),is called
the confidence coefficient
• The confidence coefficient, (1 - 𝛼), is the probability that you will not
reject the null hypothesis, 𝐻0 , when it is true and should not be
rejected. The confidence level of a hypothesis test is (1 - 𝛼) *100%.

The 𝛽 Risk
• The probability of committing a Type II error is denoted by 𝛽
• The probability of making a Type II error depends on the difference
between the hypothesized and actual values of the population
parameter
• If the difference between the hypothesized and actual values of the
population parameter is large, 𝛽 is small
• If the difference between the hypothesized and actual values of the
parameter is small, 𝛽 is large

The Power of a Test
• The complement of the probability of a Type II error, (1 - 𝛽), is called
the power of a statistical test.
• The power of a statistical test, (1 - 𝛽), is the probability that you will
reject the null hypothesis when it is false and should be rejected.
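
To make the link between 𝛽, power, and the size of the true difference concrete, here is a rough sketch that computes the power of a two-tail Z test with known 𝜎. It borrows the cereal-filling values from these slides (hypothesized mean 368, 𝜎 = 15, 𝑛 = 25); the alternative "true" means of 370 and 380 and the function name are assumptions chosen only for illustration.

```python
import math
from statistics import NormalDist


def power_two_tail_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Power (1 - beta) of a two-tail Z test when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # When the true mean is mu_true, the test statistic Z is normal with
    # mean `shift` and standard deviation 1.
    shift = (mu_true - mu0) / (sigma / math.sqrt(n))
    upper = 1 - std_normal.cdf(z_crit - shift)   # P(Z > +z_crit)
    lower = std_normal.cdf(-z_crit - shift)      # P(Z < -z_crit)
    return upper + lower


for mu_true in (370, 380):
    power = power_two_tail_z(mu0=368, mu_true=mu_true, sigma=15, n=25)
    print(f"true mean {mu_true}: power = {power:.3f}, beta = {1 - power:.3f}")
```

The larger difference (380 versus 368) yields the higher power and therefore the smaller 𝛽, which matches the statement that 𝛽 is small when the hypothesized and actual values are far apart.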

Z TEST OF HYPOTHESIS FOR THE MEAN (KNOWN 𝜎) – Two Tail Test
One-sample Z-test
The one-sample Z-test is used when we want to know whether the
difference between a sample mean and the population mean
is large enough to be statistically significant, that is, whether it is
unlikely to have occurred by chance.
Standard Error of the Mean
• SEM = (standard deviation of the population) / √(sample size) = 𝜎/√𝑛
• For example, consider a sample of 100 observations taken from a population
with a standard deviation of 10. The standard error of the mean for this sample
can be calculated as follows:
• SEM = 10/√100 = 10/10 = 1
• This means that the standard deviation of the sample mean is 1, which measures
how precisely the sample mean estimates the population mean.
The smaller the standard error of the mean, the more precise the sample mean
is as an estimate of the population mean.
• A SEM of 1 means that the sample mean will typically deviate from the
population mean by about 1.
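
The SEM arithmetic can be written as a one-line helper. This is a minimal sketch using the numbers from the example (𝜎 = 10, 𝑛 = 100); the function name and the extra sample size of 400 are illustrative only.

```python
import math


def standard_error_of_mean(sigma, n):
    """Standard error of the mean: population standard deviation over sqrt(n)."""
    return sigma / math.sqrt(n)


sem = standard_error_of_mean(sigma=10, n=100)
print(sem)  # 1.0

# Quadrupling the sample size halves the standard error.
print(standard_error_of_mean(sigma=10, n=400))  # 0.5
```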
Example
• Assume the following
• 𝜇 = 368, X̄ = 372.5, 𝜎 = 15, 𝑛 = 25
• Level of significance 𝛼 = 0.05
• 𝐻0 : 𝜇 = 368
• 𝐻1 : 𝜇 ≠ 368
• Decision rule
• Reject 𝐻0 if Z > +1.96 or if Z < −1.96
• Otherwise, do not reject 𝐻0

• 𝑍 = (X̄ − 𝜇) / (𝜎/√𝑛) = (372.5 − 368) / (15/√25) = 4.5/3 = +1.50
• Because the test statistic Z = +1.50 is between −1.96 and +1.96, you do not reject 𝐻0
• To take into account the possibility of a Type II error, you state the conclusion as
there is insufficient evidence that the mean fill is different from 368 grams.
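
The steps of this example (compute the standard error, compute Z, compare with the critical value) can be sketched in a few lines of Python. The function name and return layout are assumptions made for this illustration; only the numbers come from the example.

```python
import math
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tail one-sample Z test with known population standard deviation."""
    sem = sigma / math.sqrt(n)                    # standard error of the mean
    z = (sample_mean - mu0) / sem                 # test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    reject = abs(z) > z_crit                      # is Z in the rejection region?
    return z, z_crit, reject


z, z_crit, reject = one_sample_z_test(sample_mean=372.5, mu0=368, sigma=15, n=25)
print(f"Z = {z:+.2f}, critical value = ±{z_crit:.2f}")
print("Reject H0" if reject else "Do not reject H0")
```

Because the decision only depends on whether |Z| exceeds the critical value, the same helper applies to any two-tail problem with known 𝜎.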

Problem:
Given that
Sample size (n) = 81
Mean lifetime of the Glucometer in the sample, x̄ = 58
Mean lifetime of the Glucometer in the population, μ = 60
Standard deviation in the population, σ = 10

Standard error of the mean (SEM) = σ/√n = 10/√81 = 10/9 = 1.11

Test statistic: Z = (x̄ − μ)/SEM = (58 − 60)/1.11 = −1.80

Critical value of Z at the 5% level of significance (two-tail): ZCritical = ±1.96

Conclusion: The absolute value of the test statistic, |Z| = 1.80, is less than the critical value of Z (1.96), so we do not reject
the null hypothesis at the 5% level of significance. We conclude that there is insufficient evidence that
the mean lifetime of the Glucometer differs from 60.
Significance Level
Critical Z Values
One tail test: Example
Step:01 – Set up of Hypothesis
Z score examples using standard deviation

P(X > 300) = P((X − Mean)/SD > (300 − 200)/75)
= P(Z > 1.33)
= 1 − P(Z < 1.33) (using the Z table)
= 1 − 0.9082
= 0.0918
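
The same tail probability can be computed without a Z table. This sketch uses the mean of 200 and standard deviation of 75 from the calculation above; the variable names are illustrative. It shows both the table-style result (Z rounded to two decimals) and the unrounded one, which differ slightly in the fourth decimal place.

```python
from statistics import NormalDist

mean, sd, x = 200, 75, 300

z = (x - mean) / sd                      # 1.3333..., standardized value of 300

# Z-table style: round Z to two decimals before looking up the area.
p_table = 1 - NormalDist().cdf(round(z, 2))

# Exact: use the unrounded Z (equivalently, the N(200, 75) distribution itself).
p_exact = 1 - NormalDist(mu=mean, sigma=sd).cdf(x)

print(f"Z = {z:.2f}")
print(f"P(X > 300) with Z rounded to 1.33: {p_table:.4f}")
print(f"P(X > 300) with unrounded Z:       {p_exact:.4f}")
```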
Set up of Hypothesis: Trick
Practice Question:01
Practice Question_02:
Practice Question_03
• Example 3: A teacher claims that the mean score of students in his
class is greater than 82 with a standard deviation of 20. If a sample of
81 students was selected with a mean score of 90 then check if there
is enough evidence to support this claim at a 0.05 significance level.
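
One possible way to work this question is as a one-tail (upper-tail) Z test with 𝐻0 : 𝜇 ≤ 82 and 𝐻1 : 𝜇 > 82. The sketch below assumes the standard deviation of 20 is the known population 𝜎, which the question does not state explicitly; the function name is made up for this illustration.

```python
import math
from statistics import NormalDist


def upper_tail_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """One-tail Z test of H0: mu <= mu0 against H1: mu > mu0 (known sigma)."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha)   # all of alpha sits in the upper tail
    return z, z_crit, z > z_crit


z, z_crit, reject = upper_tail_z_test(sample_mean=90, mu0=82, sigma=20, n=81)
print(f"Z = {z:.2f}, critical value = {z_crit:.3f}")
print("Reject H0: the data support the claim" if reject
      else "Do not reject H0: insufficient evidence for the claim")
```

Under these assumptions the test statistic is (90 − 82)/(20/√81) = 3.6, which is well above the one-tail critical value of about 1.645, so 𝐻0 is rejected at the 0.05 level.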
Practice Question:04
