Confidence Intervals - Stat 20 Berkeley

Confidence Intervals

Quantifying the sampling variability of a statistic.

The process of generalizing from a statistic of a sample to a


parameter of a population is known as statistical inference. The
parameter of interest could be a mean, median, proportion,
correlation coefficient, the coefficient of a linear model . . . the
list goes on. In the scenario that unfolded in Pimentel Hall,
the parameter was the mean year of the 527 students in the
class. The process of estimating that parameter by calculating
the sample mean of the 18 students who decided to sit in the
front row that day induces a sampling distribution.

[Figure: Sampling Distribution — histogram of 𝑥̄ (mean year).]

This sampling distribution captures the two sources of error


that creep in while generalizing. The horizontal offset from the
true population parameter (the red line) to the mean of the
sampling distribution (the gold triangle) represents the bias.
The spread of the sampling distribution represents the variation.
In these lecture notes you’ll learn how to quantify sampling
variability using two common tools.

Standard Error (SE): The standard deviation of the sampling distribution of a statistic.

Confidence Interval: An interval of two values, a lower and an upper bound on the statistic, that captures most of the sampling distribution.

To focus on the variation, let’s introduce a second example, one


in which we will not need to worry about bias.

A Simple Random Sample

Restaurants in San Francisco


Every year, the city of San Francisco’s health department visits
all the restaurants in the city and inspects them for food safety.
Each restaurant is given an inspection score; these range from
100 (perfectly clean) to 48 (serious potential for contamination).
We have these scores from 2016. Let’s build up to the sampling
distribution bit by bit.

The Population Distribution


Our population consists of the restaurants in San Francisco.
Since the data are published online for all restaurants, we have
a census¹ of scores for every restaurant in the city.

[Figure: Population Distribution — proportion of restaurants at each food safety score.]
¹ The term census refers to a setting where you have access to the entire population.
The population distribution is skewed left with a long left tail.
The highest possible score is 100. It appears that even scores
are more popular than odd scores in the 90s; in fact, there are
no scores of 99, 97, or 95.
We can calculate two parameters of this population:

• The population mean, 𝜇, is 87.6.
• The population SD, 𝜎, is 8.9.

Population parameters, like the parameters of probability distributions, are usually given a Greek letter. The population mean is 𝜇, said “myoo”, and the population standard deviation is 𝜎, said “sigma”.
The Empirical Distribution
Although we have data on all of the restaurants in the city,
imagine that you’re an inspector who has visited a simple
random sample of 100 restaurants. That is, you draw 100
times without replacement from the population, with each unit
equally likely to be selected. This leads to a representative
sample that will have no selection bias.
The distribution of this sample (an empirical distribution) looks
like:

[Figure: Empirical Dist. (Sample 1) — proportion of sampled restaurants at each food safety score.]

The sample statistics here are:

• The sample mean, 𝑥̄, is 86.27.
• The sample SD, 𝑠, is 9.9.

While parameters are symbolized with Greek letters, statistics are usually symbolized with Latin letters.

Observe that the empirical distribution resembles the population
distribution because we are using a sampling method without
selection bias. It’s not a perfect match, but the shape is similar.
The sample average (𝑥̄) and the sample SD (𝑠) are also close to,
but not the same as, the population average (𝜇) and SD (𝜎).

The Sampling Distribution


If you compared your sample to that of another inspector who
visited 100 restaurants, their sample would not be identical to
yours, but it would still resemble the population distribution,
and its 𝑥̄ and 𝑠 would be close to those of all the restaurants
in the city.
The distribution of the possible values of 𝑥̄ across simple
random samples of 100 restaurants is the sampling distribution
of the sample mean. We can use it to, for example, find the
chance that the sample mean will be over 88, or the chance
that the sample mean will be between 85 and 95.
Ordinarily this distribution takes some work to create, but in
this thought-experiment we have access to the full population,
so we can simply use the computer to simulate the process. We
repeat, 100,000 times, the process of drawing a simple random
sample of 100 restaurants and computing its mean. The full
distribution looks like:

[Figure: Sampling Distribution — proportion at each value of the average food safety score.]
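This simulation can be sketched in Python. The population below is a stand-in (a truncated normal with 𝜇 = 87.6 and 𝜎 = 8.9), since the real 2016 scores are not bundled with these notes:

```python
import random
import statistics

random.seed(0)

# Stand-in population of 5766 scores; the notes use the real 2016
# data (population mean 87.6, population SD 8.9).
population = [min(100, max(48, random.gauss(87.6, 8.9))) for _ in range(5766)]

# Repeat many times: draw a simple random sample of 100 restaurants
# (without replacement) and record its mean.
sample_means = [
    statistics.mean(random.sample(population, 100))
    for _ in range(10_000)
]

print(statistics.mean(sample_means))   # close to the population mean
print(statistics.stdev(sample_means))  # the SE, roughly sigma / sqrt(100)
```

The notes use 100,000 repetitions; 10,000 keeps the sketch quick and gives nearly the same picture.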

We can consider numerical summaries of this distribution:

• The mean of the sampling distribution is 87.6.


• The SD of the sampling distribution, which is called the
Standard Error (SE), is 0.9. This convention of using a
different name for the SD for the distribution of a statistic
helps keep straight which kind of standard deviation we’re
talking about.

Observe that the sampling distribution of 𝑥̄ doesn’t look any-


thing like the population or sample. Instead, it’s roughly sym-
metric in shape with a center that matches 𝜇, and a small SE.
The small size of the SE reflects the fact that the 𝑥̄ tends to be
quite close to 𝜇.
Again, the sampling distribution provides the distribution of
the possible values of 𝑥̄. From this distribution, we find that the
chance that 𝑥̄ is over 88 is about 0.33, and the chance that 𝑥̄ is
between 85 and 95 is roughly 1.
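Both chances can be checked against the normal approximation, treating the sampling distribution as normal with mean 87.6 and SE 0.89 (standard library only; `normal_cdf` is a helper defined here, not a built-in):

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    # Normal CDF written in terms of the error function
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

mean, se = 87.6, 0.89  # center and SE of the sampling distribution

print(1 - normal_cdf(88, mean, se))                         # about 0.33
print(normal_cdf(95, mean, se) - normal_cdf(85, mean, se))  # about 1
```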

Putting the Three Panels Together


Let’s look at these three aspects of this process side-by-side.
[Figure: three panels side by side — Population Distribution, Empirical Dist. (Sample 1), and Sampling Distribution.]

        Population    Empirical     Sampling
Shape   left skew     left skew     bell-shaped / normal
Mean    𝜇 = 87.6      𝑥̄ = 86.27     87.6
SD      𝜎 = 8.9       𝑠 = 9.9       0.89

Observe that:

1. 𝜇 and the mean of the sampling distribution are roughly


the same.

2. 𝜎 and the SE of the sample averages are related in the
   following way²:

   SE(𝑥̄) ≈ 𝜎/√𝑛

3. The histogram of the sample averages is not skewed like the
   histogram of the population; on the contrary, it is symmetric
   and bell-shaped, like the normal curve.
4. The histogram of our sample of 100 resembles the population
   histogram.
5. Since 100 is a pretty large sample,

   𝜇 ≈ 𝑥̄
   𝜎 ≈ 𝑠

Up until this point, we’ve worked through this thought exper-


iment with the unrealistic assumption that we know the popu-
lation. Now we’re ready to make inferences in a setting where
we don’t know the population.

² This approximation becomes an equality for a random sample with replacement. When we have a SRS, the exact formula is

SE(𝑥̄) = √((𝑁 − 𝑛)/(𝑁 − 1)) · 𝜎/√𝑛

The additional term, called the finite population correction factor, adjusts for the fact that we are drawing without replacement. Here 𝑁 is the number of tickets in the box (the size of the population) and 𝑛 is the number of tickets drawn from the box (the size of the sample). To help make sense of this correction factor, think about the following two cases:

• Draw 𝑁 tickets from the box (that is, 𝑛 = 𝑁).
• Draw only one ticket from the box.

What happens to the SE in these two extreme cases?

In the first case, you will always see the entire population if you are drawing without replacement, so the sample mean will exactly match the population mean. The sampling distribution has no variation, so SE = 0.

In the second case, since you take only one draw from the box, it doesn’t matter whether you replace it or not, so the SE for a SRS should match the SE when sampling with replacement in this special case. In settings where 𝑁 is large relative to 𝑛, drawing without replacement effectively behaves as if you are sampling with replacement.
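The correction factor and its two extreme cases can be checked numerically; this is a sketch, and `se_mean` is a hypothetical helper, not a standard function:

```python
from math import sqrt

def se_mean(sigma, n, N=None):
    """SE of the sample mean; applies the finite population
    correction when the population size N is given (SRS)."""
    se = sigma / sqrt(n)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))  # finite population correction
    return se

# Restaurant example: sigma = 8.9, n = 100, N = 5766
print(se_mean(8.9, 100))           # with replacement: 0.89
print(se_mean(8.9, 100, N=5766))   # SRS: slightly smaller, about 0.88

# The two extreme cases from the footnote:
print(se_mean(8.9, 5766, N=5766))  # n = N: SE is exactly 0
print(se_mean(8.9, 1, N=5766))     # n = 1: matches the with-replacement SE, 8.9
```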

6
Inference for a Population Average

Drawing on our understanding of the thought-experiment, we


ask:
What happens when you don’t see the population, you just have
your sample, and you want to make an inference about the
population?
We have serious gaps in our procedure for learning about the
sampling distribution!

To start, we know we can use the sample average, 𝑥̄, to estimate
the population average, 𝜇. This is called a point estimate of
the population parameter.
But can we do better than that? Can we bring in more of the
information that we have learned from the thought-experiment?
For example, can we accompany our point estimate with a sense
of its accuracy? Ideally, this would be the SE of the sample
mean. Unfortunately, we don’t know the SE because it depends
on 𝜎. So now what do we do?

Standard Error

The thought-experiment tells us that 𝑠 is close to 𝜎 (when
you have a SRS), so we can substitute 𝑠 into the formula
for the SE:

SE(𝑥̄) ≈ 𝑠/√𝑛

When presenting your findings, you might say that, based on a
SRS of 100 restaurants in San Francisco, the average food safety
score is estimated to be 86 with a standard error of about 1.
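In code, the plug-in estimate is one line (numbers from the sample above):

```python
from math import sqrt

x_bar = 86.27  # sample mean
s = 9.9        # sample SD
n = 100

se = s / sqrt(n)     # plug-in estimate of SE(x_bar)
print(round(se, 2))  # 0.99, i.e. a standard error of about 1
```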
Suppose someone took a sample of 25 restaurants and provided
an estimate of the average food safety score. Is that only 1/4
as accurate because the sample is 1/4 the size of ours?
Suppose someone took a sample of 100 restaurants in New York
City where there are 50,000 restaurants (this is a made up
number). Is their estimate only 1/10 as accurate because the
number of units in the population is 10 times yours?
We can use the formula for the SE to answer these questions.
In the table below, we have calculated SEs for a generic value of
𝜎 and various choices of the population size and sample size.

                      Sample Size (𝑛)
Population Size (𝑁)   25           100           400
500                   SE = 𝜎/5     SE = 𝜎/10     SE = 𝜎/20
5,000                 SE = 𝜎/5     SE = 𝜎/10     SE = 𝜎/20
50,000                SE = 𝜎/5     SE = 𝜎/10     SE = 𝜎/20

What do you notice about the relationship between sample size


and population size and SE?

• The absolute size of the population doesn’t enter into the


accuracy of the estimate, as long as the sample size is
small relative to the population.
• A sample of 400 is twice as accurate as a sample of 100,
which in turn is twice as accurate as a sample of 25 (as-
suming the population is relatively much larger than the
sample). The precision of estimating the population mean
improves according to the square root of the sample size.
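A quick check of the square-root scaling, using the restaurant 𝜎:

```python
from math import sqrt

sigma = 8.9  # population SD from the restaurant example

for n in (25, 100, 400):
    se = sigma / sqrt(n)
    print(f"n = {n:3d}: SE = sigma/{sqrt(n):.0f} = {se:.2f}")
```

Quadrupling the sample size halves the SE.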

Confidence Intervals

Confidence intervals bring in more information from the


thought-experiment. The confidence interval provides an
interval estimate, instead of a point estimate, that is based on
the spread of the sampling distribution of the statistic.
We have seen that the sampling distribution takes a familiar
shape: that of the normal curve (also called the bell curve)³.
Therefore we can fill in some of the holes in the
thought-experiment with approximations.

This is the Central Limit Theorem in action. The CLT states
that sums of independent random variables become normally
distributed as 𝑛 increases. Conveniently enough, most useful
statistics are some version of a sum: 𝑥̄ is a sum divided by 𝑛,
and 𝑝̂ is a sum of variables that take values 0 or 1, divided
by 𝑛. This powerful mathematical result enables one of the
most popular methods of constructing confidence intervals.

Normal Confidence Intervals


When the sampling distribution is roughly normal in shape,
then we can construct an interval that expresses exactly how
much sampling variability there is. Using our single sample
of data and the properties of the normal distribution, we can
be 95% confident that the population parameter is within the
following interval:

[𝑥̄ − 1.96·SE, 𝑥̄ + 1.96·SE]

The number 1.96 doesn’t come out of thin air; refer to the notes
on the Normal Distribution to understand its origins.

³ This is not always the case. We’ll come back to this point later.

So for a sample where the sample mean is 86 and the 95%
confidence interval is [84.3, 88.2], you would say,

I am 95% confident that the population mean is


between 84.3 and 88.2.

For the particular interval that you have created, you don’t
know whether it contains the population mean or not. This is
why we use the term confidence to describe it instead of
probability. Probability comes into play when taking the sample;
after that, our confidence interval is a known, observed value
with nothing left to chance.
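As a sketch, here is the interval computed from round numbers like those in the example (the notes’ exact interval differs slightly):

```python
from math import sqrt

x_bar = 86.0  # sample mean
s = 9.9       # sample SD
n = 100

se = s / sqrt(n)  # estimated SE of x_bar
lower = x_bar - 1.96 * se
upper = x_bar + 1.96 * se
print(f"95% CI: [{lower:.1f}, {upper:.1f}]")  # 95% CI: [84.1, 87.9]
```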

Confidence not Probability


To be more precise about what is meant by “confidence”, let’s
take 100 samples of size 25 from the restaurant scores, and
calculate a 95% confidence interval for each of our 100 samples.
How many of these 100 confidence intervals do you think will
include the population mean?
Let’s simulate it! At the bottom of the plot below, the horizontal
line at 𝑦 = 1 indicates the coverage of the confidence interval
from the first sample; it stretches from roughly 84 to 91. The
line above it, at 𝑦 = 2, indicates the coverage of the confidence
interval that resulted from the second sample, from roughly 85
to 92.5. Both of these confidence intervals happened to cover
the true population parameter, indicated by the black vertical
line.

[Figure: 100 confidence intervals, one per iteration (𝑦 = 1, …, 100), plotted over the range of 𝑥̄; a vertical line marks the population mean.]

As we look up the graph through the remaining intervals, we
see that 95 of the 100 confidence intervals cover the population
parameter. This is by design: if we simulate another 100 times,
we may get a different number, but it is likely to be close
to 95.
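This simulation can be sketched as follows (again with a stand-in population, since the real scores are not included here):

```python
import random
import statistics
from math import sqrt

random.seed(1)

# Stand-in population (the notes use the real 2016 scores:
# mu = 87.6, sigma = 8.9, N = 5766).
population = [min(100, max(48, random.gauss(87.6, 8.9))) for _ in range(5766)]
mu = statistics.mean(population)

covered = 0
for _ in range(100):
    sample = random.sample(population, 25)    # SRS of size 25
    x_bar = statistics.mean(sample)
    se = statistics.stdev(sample) / sqrt(25)  # plug-in SE
    if x_bar - 1.96 * se <= mu <= x_bar + 1.96 * se:
        covered += 1

print(covered, "of 100 intervals cover the population mean")
```

The count lands near 95, often a little lower in practice because with 𝑛 = 25 the plug-in SE and the normal approximation are both slightly optimistic.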

Inference for a Population Proportion

To gain practice with making confidence intervals, we turn to


another example. This time we sample from a population where
the values are 0s and 1s. You will see that the process is very
much the same, although there are a few simplifications that
arise due to the nature of the population.

Suppose we only want to eat at restaurants with food safety
scores above 95. Let’s make a confidence interval for the pro-
portion of restaurants in San Francisco with scores that are
“excellent” (scores over 95). To tackle this problem, we can
modify our population. Since we need only to keep track of
whether a score is excellent, we can replace the scores on the
tickets with 0s and 1s, where 1 indicates an excellent score. Of
the 5766 restaurants in San Francisco, 1240 are excellent, so we
can think of our population as a box with 5766 tickets: 1240
marked 1 and 4526 marked 0. This time let’s take a SRS of 25.
The thought-experiment appears as:

[Figure: three panels — Population, Empirical Distribution, and Sampling Distribution for the 0-1 (excellent score) box.]

        Population                  Empirical    Sampling
Shape   right skew                  right skew   bell-shaped / normal
Mean    𝑝 = 0.22                    𝑝̂ = 0.36     0.22
SD      𝜎 = √(𝑝(1 − 𝑝)) = 0.41      𝑠 = 0.49     0.08

In the special case of a 0-1 box:

• The population average is the proportion of 1s in the box; let’s call this parameter 𝑝.
• Taking a draw from the population distribution amounts to taking a draw from a Bernoulli random variable, so 𝜎 = √(𝑝(1 − 𝑝)).
• The sampling distribution has mean 𝑝.
• The sample proportion, 𝑝̂, tends to be close to 𝑝.
• The SE of the sample proportion⁴ is approximately

  SE(𝑝̂) ≈ √(𝑝̂(1 − 𝑝̂)/𝑛)

⁴ This calculation results from casting the total number of 1s in a sample of size 𝑛 as a binomial random variable with success probability 𝑝. Call that random variable 𝑌; the variance of a binomial random variable is Var(𝑌) = 𝑛𝑝(1 − 𝑝). The full derivation appears at the end of these notes.

With an equation to estimate the SE from our data in hand, we
can form a 95% confidence interval:

[𝑝̂ − 1.96·√(𝑝̂(1 − 𝑝̂)/𝑛), 𝑝̂ + 1.96·√(𝑝̂(1 − 𝑝̂)/𝑛)]
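Plugging in the sample from the thought-experiment (𝑝̂ = 0.36, 𝑛 = 25):

```python
from math import sqrt

p_hat = 0.36  # sample proportion of excellent scores
n = 25

se = sqrt(p_hat * (1 - p_hat) / n)  # plug-in SE of p_hat
lower = p_hat - 1.96 * se
upper = p_hat + 1.96 * se
print(f"95% CI for p: [{lower:.2f}, {upper:.2f}]")  # [0.17, 0.55]
```

This interval happens to cover the true proportion, 𝑝 = 0.22.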

Summary

In these notes, we have restricted ourselves to the simple ran-


dom sample, where the only source of error that we’re con-
cerned with is sampling variability. We outlined two tools for
estimating that variability: the standard error (SE) and the
confidence interval.
We saw how the size of the sample impacts the standard error
of the estimate. The larger the sample, the more accurate our
estimates are; in particular, the accuracy improves according
to 1/√𝑛. We also found that the size of the population doesn’t
impact the accuracy, as long as the sample is small compared
to the population.
We made confidence intervals for population averages and pro-
portions using the normal distribution. This approach can be
extended to other properties of a population, such as the me-
dian of a population, or the coefficient in a regression equa-
tion.
Considering the sample proportion as a binomial count divided by 𝑛, that is 𝑝̂ = 𝑌/𝑛, and applying the properties of variance, we can find the variance of 𝑝̂:

Var(𝑝̂) = Var(𝑌/𝑛)          (1)
       = (1/𝑛²) Var(𝑌)      (2)
       = (1/𝑛²) 𝑛𝑝(1 − 𝑝)   (3)
       = 𝑝(1 − 𝑝)/𝑛         (4)

So the standard error can be calculated as:

SE(𝑝̂) = √Var(𝑝̂) = √(𝑝(1 − 𝑝)/𝑛)   (5)

When estimating the SE from data, we plug in 𝑝̂ for 𝑝.

The confidence intervals that we have made are approximate in
the following sense:

• We’re approximating the shape of the unknown sampling


distribution with the normal curve.
• The SD of the sample is used in place of the SD of the
population in calculating the SE of the statistic.

There are times when we are unwilling to make the assumption


of normality. This is the topic of the next set of notes.
