Hypothesis tests and z-scores
HYPOTHESIS TESTING IN R

Richie Cotton
Data Evangelist at DataCamp
A/B testing
Electronic Arts (EA) is a video game company.

In 2013, they released SimCity 5.

Their goal was to increase pre-orders of the game.

They used A/B testing to test different advertising scenarios.

This involves splitting users into control and treatment groups.

1 Image credit: "Electronic Arts" by majaX1 CC BY-NC-SA 2.0

Retail webpage A/B test
[Figure: control and treatment versions of the retail webpage]

A/B test results
The treatment group (no ad) got 43.4% more purchases than the control group (with ad).
The intuition that "showing an ad would increase sales" was completely wrong.

Was this result statistically significant or just by chance?

You need EA's data to determine this.

You'd use techniques from Sampling in R and from this course to do so.

Stack Overflow Developer Survey 2020
library(dplyr)
glimpse(stack_overflow)

Rows: 2,261
Columns: 8
$ respondent <dbl> 36, 47, 69, 125, 147, 152, 166, 170, 187, 196, 221,…
$ age_first_code_cut <chr> "adult", "child", "child", "adult", "adult", "adult…
$ converted_comp <dbl> 77556, 74970, 594539, 2000000, 37816, 121980, 48644…
$ job_sat <fct> Slightly satisfied, Very satisfied, Very satisfied,…
$ purple_link <chr> "Hello, old friend", "Hello, old friend", "Hello, o…
$ age_cat <chr> "At least 30", "At least 30", "Under 30", "At least…
$ age <dbl> 34, 53, 25, 41, 28, 30, 28, 26, 43, 23, 24, 35, 37,…
$ hobbyist <chr> "Yes", "Yes", "Yes", "Yes", "No", "Yes", "Yes", "Ye…

Hypothesizing about the mean
A hypothesis:

The mean annual compensation of the population of data scientists is $110,000.

The point estimate (sample statistic):

mean_comp_samp <- mean(stack_overflow$converted_comp)

# Equivalent calculation with dplyr
mean_comp_samp <- stack_overflow %>%
  summarize(mean_compensation = mean(converted_comp)) %>%
  pull(mean_compensation)

119574.7

Generating a bootstrap distribution
# Step 3. Repeat steps 1 & 2 many times
so_boot_distn <- replicate(
  n = 5000,
  expr = {
    # Step 1. Resample
    stack_overflow %>%
      slice_sample(prop = 1, replace = TRUE) %>%
      # Step 2. Calculate point estimate
      summarize(mean_compensation = mean(converted_comp)) %>%
      pull(mean_compensation)
  }
)

1 Bootstrap distributions are taught in Chapter 4 of Sampling in R
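
The three bootstrap steps can also be tried without the survey data. The following is a minimal base R sketch, assuming a simulated compensation vector (comp is a hypothetical stand-in for stack_overflow$converted_comp, not the real data) and using sample() in place of slice_sample().

```r
# Hypothetical, simulated stand-in for stack_overflow$converted_comp
set.seed(42)
comp <- rlnorm(500, meanlog = 11, sdlog = 1)

# Step 3. Repeat steps 1 & 2 many times
boot_means <- replicate(
  n = 2000,
  # Step 1. Resample with replacement; Step 2. Calculate the point estimate
  expr = mean(sample(comp, size = length(comp), replace = TRUE))
)

# The bootstrap distribution is centered close to the sample mean
mean(comp)
mean(boot_means)
```

Each resample has the same size as the original sample, which mirrors prop = 1 in the dplyr version.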

Visualizing the bootstrap distribution
library(ggplot2)

tibble(resample_mean = so_boot_distn) %>%
  ggplot(aes(resample_mean)) +
  geom_histogram(binwidth = 1000)

Standard error
std_error <- sd(so_boot_distn)

5511.674
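
As a rough sanity check on the idea that the standard deviation of the bootstrap distribution estimates the standard error, the sketch below compares it with the textbook formula for the standard error of a mean, sd(x) / sqrt(n). The sample x is simulated and purely hypothetical.

```r
# Hypothetical sample, simulated for illustration only
set.seed(1)
x <- rnorm(1000, mean = 100, sd = 20)

# Bootstrap distribution of the sample mean
boot_means <- replicate(2000, mean(sample(x, replace = TRUE)))

# Standard error estimated two ways
se_boot <- sd(boot_means)              # sd of the bootstrap distribution
se_formula <- sd(x) / sqrt(length(x))  # textbook formula for a mean

c(bootstrap = se_boot, formula = se_formula)
```

The two estimates should be close but not identical, since the bootstrap value depends on the random resamples.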

z-scores
\[ \text{standardized value} = \frac{\text{value} - \text{mean}}{\text{standard deviation}} \]

\[ z = \frac{\text{sample stat} - \text{hypoth. param. value}}{\text{standard error}} \]

\[ z = \frac{\$119{,}574.7 - \$110{,}000}{\$5{,}511.67} = 1.737 \]

mean_comp_samp

119574.7

mean_comp_hyp <- 110000

std_error

5511.674

z_score <- (mean_comp_samp - mean_comp_hyp) / std_error

1.737171
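
The same arithmetic can be wrapped in a small helper. This is only a sketch: compute_z is a hypothetical name, and the inputs are the rounded values printed on the slide, so the result matches the slide's 1.737171 only to about five decimal places.

```r
# Hypothetical helper: standardize a sample statistic against a hypothesized value
compute_z <- function(sample_stat, hyp_value, std_error) {
  (sample_stat - hyp_value) / std_error
}

# Rounded values taken from the slide
z <- compute_z(sample_stat = 119574.7, hyp_value = 110000, std_error = 5511.674)
round(z, 3)  # 1.737
```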

Testing the hypothesis
Is 1.737171 a high or low number?
Answering this kind of question is the goal of the course!

Hypothesis testing use case:

Determine whether sample statistics are close to or far away from expected (or "hypothesized") values.

Standard normal (z) distribution
Standard normal distribution: the normal distribution with mean 0 and standard deviation 1.

tibble(x = seq(-4, 4, 0.01)) %>%
  ggplot(aes(x)) +
  stat_function(fun = dnorm) +
  ylab("PDF(x)")
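
A few base R calls give a feel for the standard normal distribution without plotting it. This sketch uses dnorm() for the density and pnorm() for cumulative probabilities; the variable names are illustrative only.

```r
# Height of the standard normal PDF at its mean; equals 1 / sqrt(2 * pi)
peak_density <- dnorm(0)

# Probability of landing within 1.96 standard deviations of the mean (about 0.95)
central_area <- pnorm(1.96) - pnorm(-1.96)

# Area to the right of the z-score computed earlier for mean compensation
right_tail <- pnorm(1.737171, lower.tail = FALSE)

c(peak_density = peak_density, central_area = central_area, right_tail = right_tail)
```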

Let's practice!

p-values

Criminal trials
Two possible true states.
1. Defendant committed the crime.

2. Defendant did not commit the crime.

Two possible verdicts.

1. Guilty.

2. Not guilty.

Initially the defendant is assumed to be not guilty.

If the evidence is "beyond a reasonable doubt" that the defendant committed the crime,
then a "guilty" verdict is given, else a "not guilty" verdict is given.

Age of first programming experience
age_first_code_cut classifies when each Stack Overflow user first started programming:
1. "adult" means they started at 14 or older

2. "child" means they started before 14

Previous research suggests that 35% of software developers started programming as children.

Does our sample provide evidence that a greater proportion of data scientists started programming as children?

Definitions
A hypothesis is a statement about an unknown population parameter.

A hypothesis test is a test of two competing hypotheses.

The null hypothesis (H0) is the existing "champion" idea.

The alternative hypothesis (HA) is the new "challenger" idea of the researcher.

For our problem

H0: The proportion of data scientists starting programming as children is the same as that of software developers (35%).

HA: The proportion of data scientists starting programming as children is greater than 35%.

1 "Naught" is British English for "zero". For historical reasons, "H-naught" is the international convention for pronouncing the null hypothesis.

Criminal trials:

Two possible true states.
1. Defendant committed the crime.
2. Defendant did not commit the crime.

Two possible verdicts.
1. Guilty.
2. Not guilty.

Initially the defendant is assumed to be not guilty.

If the evidence is "beyond a reasonable doubt" that the defendant committed the crime, then a "guilty" verdict is given, else a "not guilty" verdict is given.

Hypothesis testing:

In reality, either HA or H0 is true (but not both).

The test ends in either a "reject H0" verdict or a "fail to reject H0" verdict.

Initially the null hypothesis, H0, is assumed to be true.

If the evidence from the sample is "significant" that HA is true, choose that hypothesis, else choose H0.

The significance level is "beyond a reasonable doubt" for hypothesis testing.

One-tailed and two-tailed tests
Hypothesis tests determine whether the sample statistics lie in the tails of the null distribution.

Test                               Tails
alternative different from null    two-tailed
alternative greater than null      right-tailed
alternative less than null         left-tailed

HA: The proportion of data scientists starting programming as children is greater than 35%.

Our alternative hypothesis uses "greater than," so we need a right-tailed test.

p-values
The larger the p-value, the weaker the evidence against H0 (the data are more consistent with H0).
The smaller the p-value, the stronger the evidence against H0.

Small p-values mean the statistic is in the tail of the null distribution (the distribution of the statistic if the null hypothesis were true).
The "p" in p-value stands for probability.

For p-values, "small" means "close to zero".

Defining p-values
A p-value is

the probability of observing a test statistic

as extreme or more extreme

than what was observed in our original sample,

assuming the null hypothesis is true.
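
One way to make this definition concrete is to simulate the null distribution directly. The sketch below assumes each of the 2,261 respondents independently started as a child with probability 0.35 (a binomial model for H0, which is an assumption of this example and not the bootstrap approach used in the course), then counts how often a simulated proportion is at least as extreme as the observed 0.388.

```r
set.seed(123)
n <- 2261          # rows in stack_overflow
p0 <- 0.35         # hypothesized proportion under H0
observed <- 0.388  # sample proportion from the slides

# Simulated sample proportions, assuming H0 is true (binomial model: an assumption)
null_props <- rbinom(100000, size = n, prob = p0) / n

# Right-tailed p-value: share of simulated statistics as extreme or more extreme
p_value_sim <- mean(null_props >= observed)
p_value_sim
```

Because the standard error here comes from the binomial model, the simulated p-value will not exactly match the value obtained from the z-score, but it should also be very close to zero.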

Calculating the z-score
prop_child_samp <- stack_overflow %>%
  summarize(point_estimate = mean(age_first_code_cut == "child")) %>%
  pull(point_estimate)

0.388

prop_child_hyp <- 0.35

std_error <- 0.0096028

z_score <- (prop_child_samp - prop_child_hyp) / std_error

3.956

Calculating the p-value
pnorm() is the normal CDF.

Left-tailed test → use the default lower.tail = TRUE.

Right-tailed test → set lower.tail = FALSE.

p_value <- pnorm(z_score, lower.tail = FALSE)

3.818e-05
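
The tail rules can be collected into one helper. This is a sketch under stated assumptions: p_value_from_z is a hypothetical function, and the two-tailed branch (doubling the upper-tail area of the absolute z-score) is the standard convention for a symmetric null distribution, which the slides do not show explicitly.

```r
# Hypothetical helper: convert a z-score to a p-value for a given tail
p_value_from_z <- function(z, tail = c("right", "left", "two")) {
  tail <- match.arg(tail)
  switch(tail,
    right = pnorm(z, lower.tail = FALSE),          # alternative greater than null
    left  = pnorm(z),                              # alternative less than null
    two   = 2 * pnorm(abs(z), lower.tail = FALSE)  # alternative different from null
  )
}

# Rounded z-score from the slides
p_right <- p_value_from_z(3.956, "right")
p_left  <- p_value_from_z(3.956, "left")
p_two   <- p_value_from_z(3.956, "two")
```

For a positive z-score the left-tailed p-value is close to 1, which is why choosing the tail to match the alternative hypothesis matters.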

Let's practice!

Statistical significance

p-value recap
p-values quantify the evidence against the null hypothesis: the smaller the p-value, the stronger that evidence.
Large p-value → fail to reject null hypothesis.

Small p-value → reject null hypothesis.

Where is the cutoff point?

Significance level
The significance level of a hypothesis test (α) is the threshold point for "beyond a reasonable doubt".

Common values of α are 0.1, 0.05, and 0.01.

If p ≤ α, reject H0, else fail to reject H0.

α should be set prior to conducting the hypothesis test.
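
The decision rule is short enough to write as a function. A minimal sketch follows; decide is a hypothetical name, and the default alpha of 0.05 is just one of the common values listed above.

```r
# Hypothetical helper implementing: if p <= alpha, reject H0, else fail to reject H0
decide <- function(p_value, alpha = 0.05) {
  if (p_value <= alpha) "reject H0" else "fail to reject H0"
}

decide(3.818e-05)         # p-value from the proportion example
decide(0.2)               # an illustrative large p-value
decide(0.03, alpha = 0.01)  # same rule with a stricter threshold
```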

Calculating the p-value
alpha <- 0.05
prop_child_samp <- stack_overflow %>%
  summarize(
    point_estimate = mean(age_first_code_cut == "child")
  ) %>%
  pull(point_estimate)
prop_child_hyp <- 0.35
std_error <- 0.0096028
z_score <- (prop_child_samp - prop_child_hyp) / std_error
p_value <- pnorm(z_score, lower.tail = FALSE)

3.818e-05

p_value <= alpha

TRUE

p_value is less than or equal to alpha, so reject H0 and accept HA.

The proportion of data scientists starting programming as children is greater than 35%.

Confidence intervals
For a significance level of 0.05, it's common to choose a confidence level of 1 - 0.05 = 0.95, that is, a 95% confidence interval.

conf_int <- first_code_boot_distn %>%
  summarize(
    lower = quantile(first_code_child_rate, 0.025),
    upper = quantile(first_code_child_rate, 0.975)
  )

# A tibble: 1 x 2
lower upper
<dbl> <dbl>
1 0.369 0.407
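
The quantile method can be reproduced end to end on made-up data. In this sketch, started_as_child is a simulated 0/1 vector (the 0.39 success probability is an arbitrary assumption, not a value from the survey), and boot_props plays the role of first_code_child_rate.

```r
# Hypothetical 0/1 data: 1 means the respondent started programming as a child
set.seed(2020)
started_as_child <- rbinom(2261, size = 1, prob = 0.39)

# Bootstrap distribution of the sample proportion
boot_props <- replicate(2000, mean(sample(started_as_child, replace = TRUE)))

# 95% confidence interval from the 2.5% and 97.5% quantiles
conf_int <- quantile(boot_props, probs = c(0.025, 0.975))
conf_int
```

If the hypothesized value (0.35 in the slides) falls outside such an interval, that points the same way as a small p-value.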

Types of errors
                      Truly didn't commit crime    Truly committed crime
Verdict not guilty    correct                      they got away with it
Verdict guilty        wrongful conviction          correct

             actual H0         actual HA
chosen H0    correct           false negative
chosen HA    false positive    correct

False positives are Type I errors; false negatives are Type II errors.

Possible errors in our example
If p ≤ α, we reject H0:

A false positive (Type I) error could have occurred: we thought that data scientists started coding as children at a higher rate when in reality they did not.

If p > α, we fail to reject H0:

A false negative (Type II) error could have occurred: we thought that data scientists coded as children at the same rate as software developers when in reality they coded as children at a higher rate.
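
A simulation can show how the significance level relates to false positives. The sketch below assumes H0 is really true (the proportion is exactly 0.35), draws many samples of 2,261 respondents, and runs the right-tailed z-test on each. It uses the theoretical standard error sqrt(p0 * (1 - p0) / n) for speed, which is an assumption of this example and differs from the bootstrap standard error used in the slides.

```r
set.seed(7)
n <- 2261      # sample size, as in stack_overflow
p0 <- 0.35     # true proportion, so H0 holds in every simulated sample
alpha <- 0.05

# Sample proportions from many hypothetical repeats of the survey
sim_props <- rbinom(5000, size = n, prob = p0) / n

# Right-tailed z-test for each simulated sample
std_error <- sqrt(p0 * (1 - p0) / n)
p_values <- pnorm((sim_props - p0) / std_error, lower.tail = FALSE)

# Share of tests that wrongly reject H0 (Type I errors)
false_positive_rate <- mean(p_values <= alpha)
false_positive_rate
```

The false positive rate should land near alpha, which is one way to read the significance level: it is roughly the Type I error rate you accept when H0 is true.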

Let's practice!