Lecture Slides - Hypothesis Testing

Uploaded by

smorshed03

Statistics for Data Science

Hypothesis Testing
Agenda - Hypothesis Testing
1. Hypothesis Testing
   a. Introduction
   b. Hypothesis Formulation
2. Basic concepts of Hypothesis Testing
   a. Importance of null
   b. Importance of test statistic
   c. Type I and Type II errors
   d. Hypothesis testing template
3. Performing a Hypothesis Test
   a. Some key ideas
   b. Assumptions
   c. Critical point
   d. Rejection region approach
   e. p-value approach
4. One-Tailed and Two-Tailed Tests
5. Confidence Interval and Hypothesis Test

Real World Problem
Suppose you are a quality analyst at a bulb manufacturing company and analyze the
reliability of bulbs. Historically, 70% of the bulbs pass the reliability test.

Now, a slightly altered manufacturing process (B) has been introduced to produce the bulbs.

Can you conclude whether the new process improves the reliability of the bulbs or not by
checking the number of reliable bulbs in a sample?

Gathering evidence for statistical Inference
We selected a random sample of 100 bulbs out of which 73 are reliable. Does this provide
strong evidence that the new manufacturing process is more reliable?
If the new manufacturing process were only as good as the current process, what is the
probability of getting 73 or more reliable bulbs in a sample of 100 bulbs?

Under that assumption, the probability of getting 73 or more reliable bulbs in a sample of
100 bulbs is ~0.30.

Thus, there is no strong evidence that the new process improves reliability.

Gathering evidence for statistical Inference
A similar experiment was run with yet another manufacturing process (C). A sample of 100
bulbs produced using this process had 81 reliable bulbs.

If process C were only as good as the current process, the probability of getting 81 or
more reliable bulbs in a sample of 100 bulbs is ~0.01.

Thus, there is strong evidence that process C improves reliability.
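
The two tail probabilities quoted for processes B and C can be checked against the binomial distribution. The following is a minimal sketch, assuming each bulb is reliable independently with probability 0.70 when the process is no better than the current one. The helper name `binomial_upper_tail` is introduced here for illustration and does not come from the slides.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumption: under the status quo each bulb is reliable with probability 0.70.
p_current = 0.70
n_bulbs = 100

prob_b = binomial_upper_tail(73, n_bulbs, p_current)  # process B: 73 reliable bulbs
prob_c = binomial_upper_tail(81, n_bulbs, p_current)  # process C: 81 reliable bulbs

print(f"P(X >= 73 | p = 0.70) = {prob_b:.3f}")
print(f"P(X >= 81 | p = 0.70) = {prob_c:.3f}")
```

A large tail probability means the observed count is unremarkable under the current process, while a small one means the count would be surprising if nothing had changed.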

Why Hypothesis?

Estimation: The problem of estimation is considered when there is no previous knowledge
of the population parameter. The problem is simpler in that case. A random sample is taken,
a sample statistic is computed and an appropriate point and interval estimate is suggested.

Hypothesis Testing: Often the interest is not in the numerical value of the point estimate
of the parameter, but in knowing the plausibility of a hypothesis about the population
parameter by using sample data. Estimation is not enough to arrive at a conclusion in
such cases.

What is Hypothesis?

Often we are interested in population parameter(s)

A hypothesis is a conjecture about the population parameter(s)

For example, a bulb manufacturing company is interested in knowing whether the new
manufacturing process improves reliability of the bulbs.

The objective of hypothesis testing is to SET a value for the parameter(s) and perform
a statistical TEST to see whether that value is tenable in light of the evidence gathered
from the sample.

Overview of Applications
Applications of Hypothesis Testing

● Testing research hypotheses
  e.g. a new automobile system increases the mean mpg performance

● Testing the validity of a claim
  e.g. a manufacturer claims that 1L soft drink bottles are filled with an average of at
  least 0.99L

● Testing the business decisions
  e.g. new online ad has resulted in higher online conversion rates for an E-commerce
  website

Stating the Hypothesis
Null and Alternative Hypotheses - Two mutually exclusive statements about the
population parameter(s)

Null Hypothesis (H0)
The presumed current state of the matter or status quo.
E.g. The new process for manufacturing bulbs does not improve reliability.

Alternative Hypothesis (Ha)
The rival opinion or research hypothesis or an improvement target.
E.g. The new process for manufacturing bulbs improves reliability.

Null & Alternative Formulation : Example

Mean length of lumber is specified to be 8.5m for a certain building project. A construction
engineer wants to make sure that the shipments she received adhere to that specification.

The population parameter about which the hypothesis will be formed is population mean 𝜇.

The hypotheses are

H0 : 𝜇 = 8.5

Ha : 𝜇 ≠ 8.5

Null & Alternative Formulation : Example

There is a belief that 20% of men on business travel abroad bring a significant other with
them. A hotel chain claims that this number is too low.

The population parameter about which the hypothesis will be formed is population
proportion 𝜋.

The hypotheses are

H0 : 𝜋 = 0.2

Ha : 𝜋 > 0.2

Tips to formulate Null & Alternative

Null Hypothesis
● Am I testing a status quo that already exists?
● Negation of the research question
● Always contains equality (=, ≥, ≤)

Alternate Hypothesis
● Am I testing an assumption or claim that is beyond what I know?
● Research question to be proven
● Doesn’t contain equality (≠, >, <)

Basic Concepts of Hypothesis Testing

Importance of Null

Null hypothesis is assumed to be true unless reasonably strong evidence to the contrary is
found.

Based on a random sample a decision is made whether there exists reasonably strong
evidence against the null hypothesis.

● Evidence is strong (satisfies the predetermined decision rule): reject the null
  hypothesis in favour of the alternative hypothesis.

● Evidence is not strong (does not satisfy the predetermined decision rule): fail to
  reject the null hypothesis.

Importance of Test Statistic
The test statistic is calculated from the sample data and tested against the predetermined
Decision Rule.

The test statistic is a random variable that, under the null hypothesis, follows a known
distribution such as Normal, T, F, Chi-square etc. Sometimes the tests are named after the
test statistic.

Since hypothesis testing is done on the basis of sampling distribution, the decisions made
are probabilistic.

Hence, it is very important to understand the errors associated with hypothesis testing.

Type I and Type II Error

Type I and Type II Errors

                     H0 is True                  H0 is False

Reject H0            Type I Error                Correct decision
                     Prob = α                    Prob = 1 - β
                     (level of significance)     (power of the test)

Fail to reject H0    Correct decision            Type II Error
                     Prob = 1 - α                Prob = β
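
The four probabilities in the table can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the original slides: it uses an upper-tailed z-test with a known standard deviation, draws the sample mean directly from its normal sampling distribution, and all numeric values (`mu_0`, `mu_1`, `sigma`, `n`, `alpha`) are illustrative choices.

```python
import random
from math import sqrt
from statistics import NormalDist

# Illustrative values (assumptions for this sketch only).
mu_0 = 5.0    # mean under the null hypothesis
mu_1 = 5.5    # one hypothetical true mean under the alternative
sigma = 1.3   # known population standard deviation
n = 45        # sample size
alpha = 0.05  # level of significance

std_error = sigma / sqrt(n)
z_critical = NormalDist().inv_cdf(1 - alpha)  # upper-tailed critical value


def rejects_null(sample_mean: float) -> bool:
    """Upper-tailed z-test: reject H0 when the z statistic exceeds the critical value."""
    return (sample_mean - mu_0) / std_error > z_critical


random.seed(0)
trials = 20_000

# H0 is true: sample means are centred on mu_0, so every rejection is a Type I error.
type_1_rate = sum(rejects_null(random.gauss(mu_0, std_error)) for _ in range(trials)) / trials

# H0 is false: sample means are centred on mu_1, so every non-rejection is a Type II error.
type_2_rate = sum(not rejects_null(random.gauss(mu_1, std_error)) for _ in range(trials)) / trials

# Analytical power for comparison: P(reject H0 | true mean = mu_1) = 1 - beta.
power = 1 - NormalDist().cdf(z_critical - (mu_1 - mu_0) / std_error)

print(f"Simulated Type I error rate  (alpha): {type_1_rate:.3f}")
print(f"Simulated Type II error rate (beta):  {type_2_rate:.3f}")
print(f"Simulated power (1 - beta):           {1 - type_2_rate:.3f}")
print(f"Analytical power:                     {power:.3f}")
```

The simulated Type I error rate should sit close to the chosen `alpha`, while beta and the power depend on how far the hypothetical true mean `mu_1` lies from `mu_0`.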

Type I and Type II Errors : Example

Null Hypothesis: The patient doesn’t have cancer

Alternate Hypothesis: The patient has cancer

Type I error (false positive): “The patient doesn’t have cancer but the doctor says she does”

Type II error (false negative): “The patient does have cancer but the report says she doesn’t”

Template for Hypothesis Testing

Hypothesis Testing Template

1. Identify the key question: What is the research question that you are trying to answer?

2. Establish the hypotheses: What is the metric of interest? Define the Null and Alternate Hypothesis.

3. Understand and prepare data: What data do you have? Do you understand what it means? Can it be used directly?

4. Identify the right test: Choose the method for testing based on the last three points.

5. Check the assumptions: Ensure that the data satisfies the assumptions for the test.

6. Perform the test: Reach a conclusion based on the results (p-value).

Performing a hypothesis test

Some key ideas first
Level of Significance (α)
● Probability of rejecting the null hypothesis when it is true
● Fixed before the hypothesis test.

p-value
● Probability of observing the test statistic or more extreme results than the computed
  test statistic, under the null hypothesis.
● Depends on the sample data. Alpha is pre-fixed but p-value depends on the value of the
  test statistic

Acceptance or Rejection Region
● The total area under the distribution curve of the test statistic is partitioned into
  acceptance and rejection region
● Reject the null hypothesis when the test statistic lies in the rejection region, else
  we fail to reject it

Let’s start simple

Consider the following questions in hypothesis testing

● What are the null and alternative hypotheses?
● What is an appropriate test statistic?
● What is the preset level of significance?
● How to check whether the data is giving significant evidence against the null
  hypothesis or not?

Let’s see an example and understand the significance of the above questions

For simplicity, we will assume that the population standard deviation is known and the
sample size is more than 30.

Example

It is known from experience that for a certain E-commerce company the mean delivery time
of the products is 5 days with a standard deviation of 1.3 days.

The new customer service manager of the company is afraid that the company is slipping
and collects a random sample of 45 orders. The mean delivery time of these orders comes
out to be 5.25 days.

Is there enough statistical evidence for the manager’s apprehension that the mean delivery
time of products is greater than 5 days?

This is clearly a one-tailed test concerning the population mean 𝜇, the mean delivery time
of products.

First test - z-test for One Mean

Significance of the test
Test for population mean, H0 : 𝜇 = 𝜇0

Assumptions
● Continuous data
● Normally distributed population or sample size > 30
● Known population standard deviation 𝜎
● Random sampling from the population

Test Statistic Distribution
Standard Normal distribution
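
As a worked sketch, the delivery-time example can be run through the one-mean z-test with the statistic z = (x̄ - 𝜇0) / (𝜎 / √n), using both the rejection region approach and the p-value approach. The slides do not state a level of significance for this example, so `alpha = 0.05` below is an assumption, and the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 5.0          # hypothesised mean delivery time under H0 (days)
sigma = 1.3         # known population standard deviation (days)
n = 45              # number of sampled orders
sample_mean = 5.25  # observed mean delivery time (days)
alpha = 0.05        # assumption: not specified in the example

# Test statistic: standardised distance of the sample mean from mu_0.
z_stat = (sample_mean - mu_0) / (sigma / sqrt(n))

# Rejection region approach (Ha: mu > mu_0, so the region is the upper tail).
z_critical = NormalDist().inv_cdf(1 - alpha)
reject_by_region = z_stat > z_critical

# p-value approach: probability of a z value at least this large under H0.
p_value = 1 - NormalDist().cdf(z_stat)
reject_by_p_value = p_value < alpha

print(f"z statistic      = {z_stat:.3f}")
print(f"critical value   = {z_critical:.3f}")
print(f"p-value          = {p_value:.4f}")
print(f"reject H0 (region / p-value): {reject_by_region} / {reject_by_p_value}")
```

With these numbers the z statistic is roughly 1.29 and the p-value is just under 0.10, so at the assumed 5% level both approaches fail to reject the null hypothesis. The two approaches always agree, because they compare the same statistic against the same tail area.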

One-tailed and Two-tailed Tests

One-tailed and Two-tailed Tests
Alternative Hypothesis

One-tailed test
● Greater than type: Ha : 𝜇 > 𝜇0
● Less than type: Ha : 𝜇 < 𝜇0

Two-tailed test
● Not equal type: Ha : 𝜇 ≠ 𝜇0

Choice of One tailed vs Two tailed depends on the nature of the problem, not on the sample data!

Difference between One-tailed and Two-tailed Tests

Test statistic value does not change for two-tailed or one-tailed test.

Only the critical value(s) / p-value associated with the test statistic changes

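[Figure: standard normal curves for a one-tailed test (rejection region beyond 1.645) and
a two-tailed test (rejection regions beyond -1.96 and 1.96)]

One-tailed test: The difference is not tested on this side and the hypothesis test has
greater power on the other side.

Two-tailed test: The difference is tested on both the sides.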
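
The critical values 1.645 and ±1.96 correspond to a 5% level of significance on the standard normal curve. The sketch below recomputes them and shows how one fixed test statistic gives different p-values depending on the alternative. The helper `p_value` and the observed value `z_observed = 1.8` are hypothetical, chosen only to illustrate the point.

```python
from statistics import NormalDist

std_normal = NormalDist()
alpha = 0.05

# All of alpha sits in one tail for a one-tailed test and is split in half for a two-tailed test.
one_tailed_critical = std_normal.inv_cdf(1 - alpha)
two_tailed_critical = std_normal.inv_cdf(1 - alpha / 2)


def p_value(z: float, alternative: str) -> float:
    """p-value of a z statistic for a 'greater', 'less' or 'two-sided' alternative."""
    if alternative == "greater":
        return 1 - std_normal.cdf(z)
    if alternative == "less":
        return std_normal.cdf(z)
    if alternative == "two-sided":
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError(f"unknown alternative: {alternative}")


z_observed = 1.8  # hypothetical test statistic, identical for both tests

p_one_tailed = p_value(z_observed, "greater")
p_two_tailed = p_value(z_observed, "two-sided")

print(f"critical value, one-tailed: {one_tailed_critical:.3f}")
print(f"critical values, two-tailed: +/-{two_tailed_critical:.3f}")
print(f"p-value, one-tailed: {p_one_tailed:.4f}")
print(f"p-value, two-tailed: {p_two_tailed:.4f}")
```

For this hypothetical statistic the one-tailed test rejects at the 5% level while the two-tailed test does not, which is why the choice between them has to be made from the nature of the problem before looking at the data.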

Connecting the dots with Confidence Intervals

Confidence Interval vs Hypothesis Testing
Suppose we calculate the (100 - 5)% confidence interval for the mean

We also conduct the Z-test for the mean with a 5% significance level.

The hypotheses of the Z-test are

H0 : 𝜇 = 𝜇0 against Ha : 𝜇 ≠ 𝜇0

Is there any relationship between the estimated confidence interval and the hypothesis
test?

The confidence interval contains all values of 𝜇0 for which the null hypothesis will not be
rejected.
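
This relationship can be checked numerically. The sketch below assumes a known population standard deviation, so that both the interval and the test are z-based, and uses illustrative values; the function names and the list of candidate values for 𝜇0 are introduced here and are not from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean: float, sigma: float, n: int, alpha: float):
    """Return the (1 - alpha) confidence interval for the mean with known sigma."""
    margin = NormalDist().inv_cdf(1 - alpha / 2) * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


def two_sided_z_test_rejects(sample_mean: float, mu_0: float, sigma: float,
                             n: int, alpha: float) -> bool:
    """Return True when H0: mu = mu_0 is rejected against Ha: mu != mu_0."""
    z_stat = (sample_mean - mu_0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
    return p_value < alpha


# Illustrative values (assumptions for this sketch).
sample_mean, sigma, n, alpha = 5.25, 1.3, 45, 0.05

lower, upper = z_confidence_interval(sample_mean, sigma, n, alpha)
print(f"95% confidence interval: ({lower:.3f}, {upper:.3f})")

# Hypothetical candidate values for mu_0, some inside and some outside the interval.
candidates = [4.5, 4.8, 5.0, 5.3, 5.6, 5.7, 6.0]
results = []
for mu_0 in candidates:
    inside = lower <= mu_0 <= upper
    rejected = two_sided_z_test_rejects(sample_mean, mu_0, sigma, n, alpha)
    results.append((mu_0, inside, rejected))
    print(f"mu_0 = {mu_0}: inside interval = {inside}, H0 rejected = {rejected}")
```

Every candidate inside the interval is not rejected and every candidate outside it is rejected, which is the duality between a (100 - 5)% confidence interval and a two-tailed test at the 5% level.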