
Lecture4 ConfidenceIntervals PDF

The document summarizes key concepts about confidence intervals (CIs): 1) A CI gives a range of plausible values [L, U] for a parameter rather than a single point estimate. 2) An α-confidence interval covers the true parameter with probability at least 1 − α. This is the probability that the random interval covers the fixed parameter, not the probability that the parameter falls inside a given interval. 3) CI formulas for the population mean are given under different assumptions about the population distribution and whether the variance is known; the t-distribution is used when the population is Normal and the variance is unknown. 4) CIs for the population variance are also covered.

Uploaded by

ud
Copyright
© All Rights Reserved

Lecture 4

Confidence Intervals
Lecture Summary
• Last lecture, we talked about summary statistics
and how “good” they were in estimating the
parameters
– Risk, bias, and variance
– Sampling distribution

• Another quantitative measure of how “good” a
statistic is: the confidence interval (CI)

• CIs provide an interval of certainty about the
parameter
Introduction
• Up to now, we obtained point estimates for
parameters from the sample $X_1, \dots, X_n$
– Examples: sample mean, sample variance, sample median,
sample quantile, IQR, etc.
– They are called point estimates because they provide one
single point/value/estimate about the parameter
– Mathematically: $T(X_1, \dots, X_n) \to$ a single point!

• However, suppose we want a range of possible
estimates for the parameter, an interval estimate like
$[L, U]$
– Mathematically: $L(X_1, \dots, X_n)$ and $U(X_1, \dots, X_n)$
Two-Sided Confidence Intervals
• Data/Sample: $(X_1, \dots, X_n) \sim F_\theta$
– $\theta$ is the parameter

• Two-Sided Confidence Intervals: An $\alpha$-confidence
interval ($\alpha$ is the confidence level) is a random interval,
$[L, U]$, computed from the sample, for which the following holds
$$P(L \le \theta \le U) \ge 1 - \alpha$$
– Interpretation: the probability of the interval covering
the parameter must be at least $1 - \alpha$
– It is NOT the probability of the parameter being inside
the interval!!! Why?
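One way to approach the "Why?": $\theta$ is a fixed number, while $L$ and $U$ change from sample to sample, so the probability statement describes how often the random interval lands on top of $\theta$. The sketch below simulates this under assumptions that are not part of this slide: a Normal population with known $\sigma$, and the interval $\bar{X} \pm z_{1-\alpha/2}\,\sigma/\sqrt{n}$. The function name `coverage` and every numeric setting are illustrative choices.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu=5.0, sigma=2.0, n=25, alpha=0.05, reps=20_000, seed=0):
    """Fraction of simulated intervals [L, U] that cover the fixed mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)       # standard Normal quantile
    half_width = z * sigma / n ** 0.5             # identical for every sample (sigma known)
    hits = 0
    for _ in range(reps):
        xbar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        lower, upper = xbar - half_width, xbar + half_width   # random endpoints
        hits += lower <= mu <= upper              # mu never moves; the interval does
    return hits / reps

print(coverage())   # should land close to 1 - alpha = 0.95
```

Once a single interval has been computed from data, it either contains $\theta$ or it does not; the $1 - \alpha$ refers to the long-run fraction counted by `hits / reps`.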
Comments about CIs
• Pop quiz 1: What is the confidence level, $\alpha$, for the
confidence interval $(-\infty, \infty)$?
– Thus, for any level $\alpha$, $(-\infty, \infty)$ is a valid (but terrible)
CI

• Pop quiz 2: What is the confidence level, $\alpha$, for the CI $[a, a]$,
where $a$ is any number?

• Pop quiz 3: Suppose you have two confidence intervals
$[L_1, U_1]$ and $[L_2, U_2]$. If the first CI is shorter than the
second, what does this imply?
– If $\alpha$ is the same for both intervals, what would this imply about
the shorter interval (in comparison to the longer interval)?

• Main point: given some confidence level $\alpha$, you want to
obtain the shortest CI
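As a hedged illustration of the main point, the sketch below assumes a Normal population with known $\sigma$ and considers intervals of the form $[\bar{X} - z_{1-\alpha_1}\,\sigma/\sqrt{n},\ \bar{X} + z_{1-\alpha_2}\,\sigma/\sqrt{n}]$ with $\alpha_1 + \alpha_2 = \alpha$. Every such split has the same coverage $1 - \alpha$, but the widths differ. The helper `width_in_se` is a made-up name; it reports the width in units of $\sigma/\sqrt{n}$.

```python
from statistics import NormalDist

def width_in_se(alpha_lower, alpha_upper):
    """Width, in units of sigma/sqrt(n), of the interval
    [xbar - z_{1-alpha_lower} * se, xbar + z_{1-alpha_upper} * se]."""
    std = NormalDist()
    return std.inv_cdf(1 - alpha_lower) + std.inv_cdf(1 - alpha_upper)

alpha = 0.05
for a_low in (0.005, 0.01, 0.025, 0.04, 0.045):
    a_up = alpha - a_low                      # the two tail probabilities sum to alpha
    print(f"split {a_low:.3f} / {a_up:.3f}: width = {width_in_se(a_low, a_up):.3f} se")
```

In this setting the equal split gives the smallest width, which is one reason the symmetric $z_{1-\alpha/2}$ form is the usual choice for the mean.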
CIs for Population Mean
• Case 1: If the population is Normal and $\sigma^2$ is
known
CI: $\bar{X} \pm z_{1-\alpha/2} \, \frac{\sigma}{\sqrt{n}}$
Hint: Use sampling distribution of $\bar{X}$
• Case 2: If the population is not Normal and $\sigma^2$
is known
Approximate CI: $\bar{X} \pm z_{1-\alpha/2} \, \frac{\sigma}{\sqrt{n}}$
Hint: Use CLT of $\bar{X}$
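The hint for Case 1 can be unpacked as follows. This is a sketch of the standard argument, with $z_{1-\alpha/2}$ denoting the $1-\alpha/2$ quantile of $N(0,1)$; in Case 2 the first line holds only approximately (by the CLT), so the final equality becomes an approximation.

```latex
\begin{align*}
\bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)
  &\;\Longrightarrow\;
  \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1) \\[4pt]
P\!\left(-z_{1-\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{1-\alpha/2}\right)
  &= 1 - \alpha \\[4pt]
P\!\left(\bar{X} - z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le
         \bar{X} + z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right)
  &= 1 - \alpha
\end{align*}
```

The last line is the second one rearranged to isolate $\mu$, which gives $L = \bar{X} - z_{1-\alpha/2}\,\sigma/\sqrt{n}$ and $U = \bar{X} + z_{1-\alpha/2}\,\sigma/\sqrt{n}$.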
What if the variance, $\sigma^2$, is unknown?
t Distribution
• Formal Definition: A random variable $X$ has a t-distribution with $n$ degrees
of freedom, denoted $t_n$, if it has the probability density function
$$f(x) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\,\Gamma\left(\frac{n}{2}\right)} \left(1 + \frac{x^2}{n}\right)^{-\frac{n+1}{2}}$$
where $\Gamma(\cdot)$ is the gamma function

• Useful Definition: Consider the following random variable
$$X = \frac{Z}{\sqrt{V/n}}$$
where $Z \sim N(0, 1)$, $V \sim \chi^2_n$, and $Z$ and $V$ are independent.
Then, $X \sim t_n$
– You will prove the relation between the two in the homework

• “Quick and Dirty” Definition: If $X_1, \dots, X_n \sim N(\mu, \sigma^2)$, i.i.d., then
$$\frac{\bar{X} - \mu}{\hat{\sigma}/\sqrt{n}} \sim t_{n-1}$$
– Notice that you can transform $\bar{X} - \mu$ into a standard Normal
– From lecture 3, $\hat{\sigma}^2$ is $\chi^2_{n-1}$, with some constant multipliers
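The useful definition doubles as a recipe for simulating t-distributed draws: generate $Z$, generate an independent $V \sim \chi^2_n$ as a sum of $n$ squared standard Normals, and form $Z/\sqrt{V/n}$. The sketch below does this and compares a two-sided tail probability with the standard Normal one. The name `draw_t`, the cutoff of 2, and the choice $n = 3$ are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist

def draw_t(n, rng):
    """One draw of X = Z / sqrt(V / n); V is a sum of n squared N(0, 1) draws,
    so V is chi-square with n degrees of freedom and independent of Z."""
    z = rng.gauss(0, 1)
    v = sum(rng.gauss(0, 1) ** 2 for _ in range(n))
    return z / math.sqrt(v / n)

rng = random.Random(1)
n, cutoff, reps = 3, 2.0, 100_000
tail_t = sum(abs(draw_t(n, rng)) > cutoff for _ in range(reps)) / reps
tail_normal = 2 * (1 - NormalDist().cdf(cutoff))
print(f"P(|t_{n}| > {cutoff}) ~ {tail_t:.3f}   vs   P(|Z| > {cutoff}) = {tail_normal:.3f}")
```

The simulated t tail comes out several times larger than the Normal tail for such a small $n$.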
Property of the t Distribution
• The t-distribution has fatter tails than the normal
distribution (see picture from previous slide)
– Consequence: the “tail” probabilities for the t-distribution are
bigger than those for the normal distribution!

• If the degrees of freedom go to $\infty$, then
$$t_n \to N(0, 1) \quad \text{as } n \to \infty$$
Proof sketch: in the useful definition, $V/n \to 1$ by the law of large
numbers, so $X = Z/\sqrt{V/n}$ behaves like $Z$ for large $n$

• This means that with a large sample size ($n$), we can
approximate $t_n$ with a standard normal distribution
$$P(t_n \le x) \approx P(Z \le x)$$
for large $n$
– General rule of thumb for how large $n$ should be: $n \ge 30$
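To get a feel for the approximation $P(t_n \le x) \approx P(Z \le x)$, the sketch below evaluates the left-hand side numerically from the t density and compares it with the Normal CDF. The numerical method (Simpson's rule with 2000 steps) and the evaluation point $x = 1.96$ are choices made for this illustration, not part of the lecture.

```python
import math
from statistics import NormalDist

def t_pdf(x, n):
    """t density with n degrees of freedom; lgamma avoids overflow for large n."""
    log_const = math.lgamma((n + 1) / 2) - math.lgamma(n / 2) - 0.5 * math.log(n * math.pi)
    return math.exp(log_const - (n + 1) / 2 * math.log1p(x * x / n))

def t_cdf(x, n, steps=2000):
    """P(t_n <= x): 0.5 by symmetry plus Simpson's rule for the integral from 0 to x."""
    h = x / steps
    total = t_pdf(0.0, n) + t_pdf(x, n)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, n)
    return 0.5 + total * h / 3

x = 1.96
print(f"Normal:   P(Z <= {x})   = {NormalDist().cdf(x):.4f}")
for n in (2, 5, 30, 1000):
    print(f"n = {n:4d}: P(t_n <= {x}) = {t_cdf(x, n):.4f}")
```

The gap to the Normal value shrinks as $n$ grows, which is the behaviour the $n \ge 30$ rule of thumb relies on.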
CIs for Population Mean
• Case 3: If the population is Normal and variance is
unknown
CI: $\bar{X} \pm t_{n-1,\,1-\alpha/2} \, \frac{\hat{\sigma}}{\sqrt{n}}$
Hint: Use the “quick and dirty” version of the
t-distribution
• Case 4: If the population is not Normal and variance is
unknown (i.e. the “realistic” scenario)
Approximate CI: $\bar{X} \pm z_{1-\alpha/2} \, \frac{\hat{\sigma}}{\sqrt{n}}$
Hint: Use CLT!
– Demo in class
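Cases 3 and 4 share one template, $\bar{X} \pm q\,\hat{\sigma}/\sqrt{n}$, and differ only in the quantile $q$. The sketch below makes that explicit. It is not the in-class demo: the function `mean_interval` and the data are made up, $\hat{\sigma}$ is taken to be the sample standard deviation with the $n - 1$ divisor, and the t quantile is typed in from a standard t table.

```python
from statistics import NormalDist, fmean, stdev

def mean_interval(data, quantile):
    """xbar +/- quantile * sigma_hat / sqrt(n), sigma_hat = sample standard deviation."""
    n = len(data)
    xbar = fmean(data)
    half_width = quantile * stdev(data) / n ** 0.5    # stdev uses the n - 1 divisor
    return xbar - half_width, xbar + half_width

data = [4.9, 5.6, 5.1, 4.4, 5.8, 5.0, 4.7, 5.3, 5.2, 4.6]    # made-up sample, n = 10

# Case 3 (Normal population): t quantile with n - 1 = 9 degrees of freedom.
# 2.262 is the tabulated value of t_{9, 0.975}.
t_lower, t_upper = mean_interval(data, 2.262)

# Case 4 (CLT approximation): standard Normal quantile. n = 10 is too small
# for this to be trustworthy; it is shown only to compare widths.
z_lower, z_upper = mean_interval(data, NormalDist().inv_cdf(0.975))

print(f"t interval: [{t_lower:.3f}, {t_upper:.3f}]")
print(f"z interval: [{z_lower:.3f}, {z_upper:.3f}]")
```

Both intervals are centred at $\bar{X}$; the t interval is wider because the t-distribution has fatter tails.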
Summary of CIs for the Population Mean
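| Scenario | CI | Derivation | Width |
|---|---|---|---|
| Population is Normal; variance is known | $\bar{X} \pm z_{1-\alpha/2} \frac{\sigma}{\sqrt{n}}$ | Use sampling distribution for $\bar{X}$ | Fixed width CI |
| Population is not Normal; variance is known | $\bar{X} \pm z_{1-\alpha/2} \frac{\sigma}{\sqrt{n}}$ | Approximate CI, use CLT | Fixed width CI |
| Population is Normal; variance is unknown | $\bar{X} \pm t_{n-1,\,1-\alpha/2} \frac{\hat{\sigma}}{\sqrt{n}}$ | Use the t distribution | Variable width CI |
| Population is not Normal; variance is unknown | $\bar{X} \pm z_{1-\alpha/2} \frac{\hat{\sigma}}{\sqrt{n}}$ | Approximate CI, use CLT | Variable width CI |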
CIs for Population Variance
• Case I: If the population is Normal and all
parameters are unknown
$$\left[ \frac{(n-1)\hat{\sigma}^2}{\chi^2_{n-1,\,1-\alpha/2}},\; \frac{(n-1)\hat{\sigma}^2}{\chi^2_{n-1,\,\alpha/2}} \right]$$
– Hint: Use the sampling distribution for $\hat{\sigma}^2$

• Case II: (Homework question) If the population is
Normal and the population mean is known.
– Hint: Use $\frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2$ and the sampling
distribution related to it!
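A minimal sketch of the Case I interval follows, assuming $\hat{\sigma}^2$ is the sample variance with the $n - 1$ divisor. The Python standard library has no chi-square quantile function, so the sketch obtains $\chi^2_{k,p}$ by integrating the chi-square density with Simpson's rule and solving for the quantile by bisection. That numerical route, the helper names, and the data are all choices made for this illustration. Case II is left alone, since it is a homework question.

```python
import math
from statistics import variance

def chi2_cdf(x, k, steps=2000):
    """P(chi-square_k <= x) via Simpson's rule on the density (assumes k >= 2)."""
    def pdf(u):
        if u == 0.0:
            return 0.5 if k == 2 else 0.0
        log_pdf = (k / 2 - 1) * math.log(u) - u / 2 - (k / 2) * math.log(2) - math.lgamma(k / 2)
        return math.exp(log_pdf)
    h = x / steps
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return total * h / 3

def chi2_quantile(p, k):
    """Solve chi2_cdf(x, k) = p for x by bisection."""
    lo, hi = 0.0, k + 20 * math.sqrt(2 * k) + 20     # generous upper bracket
    for _ in range(50):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def variance_interval(data, alpha=0.05):
    """[(n-1) s^2 / chi2_{n-1, 1-alpha/2}, (n-1) s^2 / chi2_{n-1, alpha/2}]."""
    n = len(data)
    s2 = variance(data)                              # sample variance, n - 1 divisor
    lower = (n - 1) * s2 / chi2_quantile(1 - alpha / 2, n - 1)
    upper = (n - 1) * s2 / chi2_quantile(alpha / 2, n - 1)
    return lower, upper

data = [4.9, 5.6, 5.1, 4.4, 5.8, 5.0, 4.7, 5.3, 5.2, 4.6]   # made-up sample, n = 10
print(variance_interval(data))
```

Note that the larger quantile sits in the denominator of the lower endpoint, and that the interval is not symmetric around $\hat{\sigma}^2$ because the chi-square distribution is skewed.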
Lecture Summary
• Another quantitative measure of how “good” a
statistic is: the confidence interval (CI)

• CIs provide an interval of certainty about the
parameter

• We derived results for the population mean and the
population variance, under various assumptions about
the population
– Normal vs. not Normal
– known variance vs. unknown variance
Extra Slides
One-Sided Confidence Intervals
• One-Sided Confidence Intervals: An $\alpha$-confidence
interval is a random interval, $[L, \infty)$, computed from the
sample, for which the following holds
$$P(L \le \theta) \ge 1 - \alpha$$

• One-Sided Confidence Intervals: An $\alpha$-confidence
interval is a random interval, $(-\infty, U]$, computed from the
sample, for which the following holds
$$P(\theta \le U) \ge 1 - \alpha$$
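As a hedged example, for the mean of a Normal population with known $\sigma$ (an assumption added here; the slide states no formula), the one-sided bounds put all of $\alpha$ into a single tail and therefore use $z_{1-\alpha}$ instead of $z_{1-\alpha/2}$. The function name and the data are illustrative.

```python
from statistics import NormalDist, fmean

def one_sided_bounds(data, sigma, alpha=0.05):
    """Return (L, U): L for the interval [L, inf) and U for (-inf, U].
    Each interval separately has coverage 1 - alpha."""
    z = NormalDist().inv_cdf(1 - alpha)       # all of alpha goes into one tail
    se = sigma / len(data) ** 0.5
    xbar = fmean(data)
    return xbar - z * se, xbar + z * se

lower, upper = one_sided_bounds(
    [4.9, 5.6, 5.1, 4.4, 5.8, 5.0, 4.7, 5.3, 5.2, 4.6], sigma=0.45
)
print(f"[{lower:.3f}, inf)   and   (-inf, {upper:.3f}]")
```

The two bounds are meant to be used one at a time: combining them into $[L, U]$ gives a two-sided interval with coverage $1 - 2\alpha$, not $1 - \alpha$.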
