STAT2225 Module 6. Intro To Inferential Statistics


Introduction to Inferential Statistics
Jomel R. Alanzalon
STAT 2225: Elementary Statistics & Probability
2nd Semester, A.Y. 2023-2024

CENTRAL LUZON STATE UNIVERSITY


DEPARTMENT of STATISTICS

Statistical Inference and its Areas
Statistical inference is the process of using sample results to draw conclusions about the characteristics of the population.

Statistical inference has two areas:
• Estimation of Parameter: to obtain a guess or an estimate of the unknown value, along with the determination of its accuracy
• Hypothesis Testing: to examine whether the sample data support or contradict the investigator's conjecture about the true value of the parameter
Introduction to Inferential Statistics | 2
Estimation
Important Terms
Estimator
• formula or process for using sample data to estimate a population
parameter
Estimate
• specific value or range of values used to approximate a population
parameter
• Point Estimate – single value
• Interval Estimate – range of values
Standard Error
• standard deviation of an estimator
Sampling Distribution
• The sampling distribution of a statistic (such as mean, variance,
etc.) is the distribution of all values of the statistic when all possible
samples of the same size 𝑛 are taken from the same population

Illustration: Sampling Distribution of the Mean
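
The definition above can be made concrete with a very small population. The sketch below is only an illustration: the population values and the sample size are hypothetical, and sampling is assumed to be with replacement. It lists every possible sample, computes each sample mean, and tabulates how often each value of the mean occurs.

```python
from collections import Counter
from itertools import product
from statistics import mean

# Hypothetical toy population, chosen only for illustration.
population = [2, 4, 6, 8]
n = 2  # sample size

# Every possible sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))
sample_means = [mean(sample) for sample in samples]

# Sampling distribution of the mean: each possible value of the
# sample mean and the fraction of samples that produce it.
sampling_distribution = {
    value: count / len(samples)
    for value, count in sorted(Counter(sample_means).items())
}

for value, probability in sampling_distribution.items():
    print(f"sample mean = {value}: probability = {probability:.4f}")

print("population mean:", mean(population))
print("mean of all sample means:", mean(sample_means))
```

In this toy case the mean of all sample means equals the population mean, which is what it means for the sample mean to be an unbiased estimator.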

Types of Estimation
1. Point Estimation
• a single value is computed from sample data to estimate the population parameter.

2. Interval Estimation
• an interval or range of values is produced that is likely to contain the true value of the parameter.

Properties of a Good Estimator
1. The estimator should be an unbiased estimator. That is, the expected
value or the mean of the estimates obtained from samples of a given size
is equal to the parameter being estimated.
2. The estimator should be consistent. For a consistent estimator, as
sample size increases, the value of the estimator approaches the value of
the parameter being estimated.
3. The estimator should be a relatively efficient estimator. That is, of all the
statistics that can be used to estimate a parameter, the relatively efficient
estimator has the smallest variance.

Point Estimation
- a single value is computed from sample data to estimate the population parameter.

Point Estimator
• the rule or formula that describes the calculation of a single-value estimate

Point Estimate
• the calculated single value used to estimate the parameter

Examples of Point Estimator and Point Estimates

Point Estimators:
Statistic                                  Parameter
Sample Mean ($\bar{x}$)                    Population Mean ($\mu$)
Sample Variance ($s^2$)                    Population Variance ($\sigma^2$)
Sample Standard Deviation ($s$)            Population Standard Deviation ($\sigma$)
Sample Proportion ($\hat{p}$)              Population Proportion ($p$)

In $\bar{X} = 25.8$, the symbol $\bar{X}$ is the point estimator and the value 25.8 is the point estimate.

• The point estimator for the population mean $\mu$ is the sample mean $\bar{X}$:
$\bar{X} = \frac{\sum X_i}{n}$
The computed value will be the point estimate of the population mean $\mu$.
• The point estimator for the population variance $\sigma^2$ is the sample variance $s^2$:
$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$
The computed value will be the point estimate of the population variance $\sigma^2$.
• The point estimator for the population proportion $p$ is the sample proportion $\hat{p}$:
$\hat{p} = \frac{x}{n}$
The computed value will be the point estimate of the population proportion $p$.
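
The three point estimators above can be evaluated directly with Python's standard library. The sketch below is one possible way to do it; it borrows the drill times from Example 4 and the counts from Example 5 of this module, and the variable names are illustrative.

```python
from statistics import mean, stdev, variance

# Drill times in minutes (the nine values given in Example 4).
times = [33, 9.7, 8.4, 11.8, 7, 6.5, 11.1, 10.4, 12.4]

x_bar = mean(times)          # point estimate of the population mean
s_squared = variance(times)  # sample variance (divides by n - 1)
s = stdev(times)             # sample standard deviation

# Successful operations out of all sampled operations (Example 5).
x, n = 1432, 1500
p_hat = x / n                # point estimate of the population proportion

print(f"x_bar = {x_bar:.2f}")
print(f"s^2   = {s_squared:.2f}")
print(f"s     = {s:.2f}")
print(f"p_hat = {p_hat:.2f}")
```

Note that `statistics.variance` and `statistics.stdev` use the $n - 1$ denominator, which matches the sample variance formula above.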

Interval Estimation
- two values are calculated to form an interval or range within
which the parameter is expected to lie/fall

Interval Estimator
• the rule or formula that describes the calculation of interval
of values

Interval Estimate
• the calculated interval or range of values used to estimate the parameter
• also known as confidence interval (CI)

Important Concepts in Interval Estimation
Confidence Interval (CI)
• range/interval of values used to estimate the true value of a
population parameter
• It gives us a much better sense of how good an estimate is

Confidence Level (𝟏 − 𝜶)
• probability or percentage that the confidence interval actually does
contain the population parameter
• also known as degree of confidence or confidence coefficient
• 𝛼 is the significance level
Commonly used confidence levels and their significance levels

Confidence Level ($1 - \alpha$)     Significance Level ($\alpha$)
90%                                 10%
95%                                 5%
99%                                 1%

95% is commonly used because it provides a good balance between confidence and precision.

If the confidence level is not given, the default value we use is 95%.

Critical Value
• number on the borderline separating sample statistics that are likely to occur from those that are unlikely to occur
• In interval estimation, the critical value can be found using
- the Z-distribution (positive $Z_{\alpha/2}$) if $\sigma$ is known
- the Student t-distribution ($t_{\alpha/2,\, n-1}$) if $\sigma$ is unknown
- other statistical tables


Most Common Critical Values in the Z-distribution

Confidence Level ($1 - \alpha$)     Significance Level ($\alpha$)     Critical Value ($Z_{\alpha/2}$)
90%                                 10%                               $Z_{0.05} = 1.645$
95%                                 5%                                $Z_{0.025} = 1.96$
98%                                 2%                                $Z_{0.01} = 2.326$
99%                                 1%                                $Z_{0.005} = 2.575$

used when $\sigma$ is known or given
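
The tabulated values can be checked numerically. The sketch below assumes Python 3.8 or later, where `statistics.NormalDist` is available; the helper name `z_critical` is illustrative. It computes the positive $Z_{\alpha/2}$ as the point that leaves an area of $\alpha/2$ in the upper tail of the standard normal distribution.

```python
from statistics import NormalDist


def z_critical(confidence_level):
    """Positive critical value Z_{alpha/2} for a given confidence level."""
    alpha = 1 - confidence_level
    # inv_cdf takes the area to the LEFT of the point, so use 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%} confidence: Z = {z_critical(level):.4f}")
```

The computed 99% value is about 2.5758; tables commonly list it as 2.575 or 2.576, so small differences in the last digit of later results are expected.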

Student t-distribution
- t-table
- used when $\sigma$ is unknown or not given

Example: in the t-table, the row for $df = 18$ and the column for $\alpha = 0.05$ give $t_{0.05,\, 18} = 1.734$.

Margin of Error (E)
• maximum error of the estimate; maximum likely difference between the
point estimate of a parameter and the actual value of the parameter
• computed as the product of critical value and standard error

Interval Estimation for Population Mean $\mu$
Confidence Interval for the Population Mean $\mu$

$\bar{X} - E < \mu < \bar{X} + E$

Here $\bar{X} - E$ is the lower limit of the interval and $\bar{X} + E$ is the upper limit of the interval.

Requirements/Assumptions
a. The sample is a random sample.
b. The population is normally distributed or the sample size is large
(𝑛 > 30).

Steps in Constructing a CI for the Population Mean $\mu$
1. Verify if the requirements are met.
2. Evaluate the margin of error ($E$).
a. Case 1: When $\sigma$ is known
$E = Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$
b. Case 2: When $\sigma$ is unknown
$E = t_{\alpha/2,\, n-1} \cdot \frac{s}{\sqrt{n}}$
3. Solve for the limits: $\bar{X} - E < \mu < \bar{X} + E$
4. Round off the confidence interval limits.
• When the original set of data is given, round the limits to one more decimal place than is used in the data.
• When only the summary statistics (mean and standard deviation) are given, round the limits to the same number of decimal places used for the sample mean.
5. Interpret the confidence interval.
“We are $(1 - \alpha) \cdot 100\%$ confident that the interval from $\bar{X} - E$ to $\bar{X} + E$ actually does contain the true value of the parameter.”
Example 1.
A random sample of 100 fish caught at Taal Lake has a mean length of 35.5 cm. Assuming that it is known that the lengths of the population of fish in Taal Lake follow a normal distribution with a population standard deviation of 5 cm, construct a 95% confidence interval for the mean length of all fish in Taal Lake.

Solution:
Given: $n = 100$, $\bar{X} = 35.5$, $\sigma = 5$
$1 - \alpha = 0.95 \rightarrow \alpha = 0.05$

Since $\sigma$ is given in the problem, we will use Case 1 to calculate the margin of error. From the Z-table, the critical value for a 95% confidence level is $Z_{0.025} = 1.96$.

$E = Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$
$= Z_{0.05/2} \cdot \frac{5}{\sqrt{100}}$
$= Z_{0.025} \cdot \frac{5}{\sqrt{100}}$
$= 1.96 \cdot \frac{5}{\sqrt{100}}$
$E = 0.98$
Solve for limits.

$\bar{X} - E < \mu < \bar{X} + E$
$35.5 - 0.98 < \mu < 35.5 + 0.98$
$34.52 < \mu < 36.48$
$34.5 < \mu < 36.5$

We are 95% confident that the interval from 34.5 cm to 36.5 cm actually does contain the true value of the mean length of all fish in Taal Lake.

Example 2.
A random sample of 100 fish caught at Taal Lake has a mean length of 35.5 cm. Assuming that it is known that the lengths of the population of fish in Taal Lake follow a normal distribution with a population standard deviation of 5 cm, construct a 99% confidence interval for the mean length of all fish in Taal Lake.

Solution:
Given: $n = 100$, $\bar{X} = 35.5$, $\sigma = 5$
$1 - \alpha = 0.99 \rightarrow \alpha = 0.01$

Since $\sigma$ is given in the problem, we will use Case 1 to calculate the margin of error. From the Z-table, the critical value for a 99% confidence level is $Z_{0.005} = 2.575$.

$E = Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$
$= Z_{0.01/2} \cdot \frac{5}{\sqrt{100}}$
$= Z_{0.005} \cdot \frac{5}{\sqrt{100}}$
$= 2.575 \cdot \frac{5}{\sqrt{100}}$
$E = 1.2875$
Solve for limits.

$\bar{X} - E < \mu < \bar{X} + E$
$35.5 - 1.2875 < \mu < 35.5 + 1.2875$
$34.2125 < \mu < 36.7875$
$34.2 < \mu < 36.8$

We are 99% confident that the interval from 34.2 cm to 36.8 cm actually does contain the true value of the mean length of all fish in Taal Lake.

Note: As the confidence level increases, the confidence interval becomes wider.
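
The Case 1 computation in Examples 1 and 2 can be sketched in a few lines of Python. The function name `mean_ci_known_sigma` is illustrative, and the critical values are passed in from the Z-table rather than computed.

```python
from math import sqrt


def mean_ci_known_sigma(x_bar, sigma, n, z):
    """Return (E, lower, upper) for a CI of the mean when sigma is known."""
    margin = z * sigma / sqrt(n)  # E = Z_{alpha/2} * sigma / sqrt(n)
    return margin, x_bar - margin, x_bar + margin


# Fish lengths: n = 100, sample mean 35.5 cm, sigma = 5 cm.
e95, lo95, hi95 = mean_ci_known_sigma(35.5, 5, 100, z=1.96)   # 95% level
e99, lo99, hi99 = mean_ci_known_sigma(35.5, 5, 100, z=2.575)  # 99% level

print(f"95%: E = {e95:.4f}, interval = ({lo95:.1f}, {hi95:.1f})")
print(f"99%: E = {e99:.4f}, interval = ({lo99:.1f}, {hi99:.1f})")
print(f"widths: {hi95 - lo95:.3f} vs {hi99 - lo99:.3f}")
```

Only the critical value changes between the two calls, so the wider 99% interval comes entirely from the larger $Z_{\alpha/2}$.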
Example 3.
A study found that the body temperatures of healthy adults are normally distributed. A random sample of 15 adults has a sample mean of 98.20 degrees Fahrenheit and a sample standard deviation of 0.62 degrees Fahrenheit. At the 95% level of confidence, construct a CI for the population mean of all body temperatures of healthy adults.

Solution:
Given: $n = 15$, $\bar{X} = 98.20$, $s = 0.62$
$1 - \alpha = 0.95 \rightarrow \alpha = 0.05$

Since $\sigma$ is unknown, we will use Case 2 to calculate the margin of error. From the t-table (Student t-distribution, used when $\sigma$ is unknown or not given), the critical value is $t_{0.025,\, 14} = 2.145$.

$E = t_{\alpha/2,\, n-1} \cdot \frac{s}{\sqrt{n}}$
$= t_{0.05/2,\, 15-1} \cdot \frac{0.62}{\sqrt{15}}$
$= t_{0.025,\, 14} \cdot \frac{0.62}{\sqrt{15}}$
$= 2.145 \cdot \frac{0.62}{\sqrt{15}}$
$E = 0.3434$
Solve for limits.

$\bar{X} - E < \mu < \bar{X} + E$
$98.20 - 0.3434 < \mu < 98.20 + 0.3434$
$97.8566 < \mu < 98.5434$
$97.86 < \mu < 98.54$

We are 95% confident that the interval from 97.86 °F to 98.54 °F actually does contain the true value of the mean body temperature of healthy adults.

Example 4.
The football coach randomly selected nine players and timed how long
each player took to perform a certain drill. Assume that the data are
normally distributed. Times (in minutes) were as follows:

33 9.7 8.4 11.8 7 6.5 11.1 10.4 12.4


a. Find the point estimate for the mean length of time it took a player to
perform the drill.
b. Construct a 90% confidence interval for the mean length of time it took
a player to perform the drill.

Solution: a. Point Estimate

The point estimator of the population mean $\mu$ is the sample mean $\bar{X}$, thus

$\bar{X} = \frac{\sum X_i}{n} = \frac{33 + 9.7 + 8.4 + 11.8 + 7 + 6.5 + 11.1 + 10.4 + 12.4}{9}$
$\bar{X} = 12.26$ minutes

Solution: b. 90% Confidence Interval

Given: $n = 9$, $\bar{X} = 12.26$, $1 - \alpha = 0.90 \rightarrow \alpha = 0.10$

Since $\sigma$ is unknown, we will use Case 2 to calculate the margin of error. $s$ is also not given, but we can compute it from the data, which gives $s = 8.04$. From the t-table (Student t-distribution), the critical value is $t_{0.05,\, 8} = 1.860$.

$E = t_{\alpha/2,\, n-1} \cdot \frac{s}{\sqrt{n}}$
$= t_{0.10/2,\, 9-1} \cdot \frac{8.04}{\sqrt{9}}$
$= t_{0.05,\, 8} \cdot \frac{8.04}{\sqrt{9}}$
$= 1.860 \cdot \frac{8.04}{\sqrt{9}}$
$E = 4.9848$
Given: $\bar{X} = 12.26$, $E = 4.9848$

Solve for limits.

$\bar{X} - E < \mu < \bar{X} + E$
$12.26 - 4.9848 < \mu < 12.26 + 4.9848$
$7.2752 < \mu < 17.2448$
$7.28 < \mu < 17.24$

We are 90% confident that the interval from 7.28 minutes to 17.24 minutes actually does contain the true value of the mean length of time it takes a player to perform the drill.
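
The Case 2 computation of Example 4 can also be run from the raw data. In the sketch below the function name `t_interval` is illustrative, and the critical value 1.860 is read from the t-table rather than computed. The first call carries full precision; the second uses the rounded summary statistics from the worked solution.

```python
from math import sqrt
from statistics import mean, stdev


def t_interval(x_bar, s, n, t_crit):
    """Return (E, lower, upper) for a CI of the mean when sigma is unknown."""
    margin = t_crit * s / sqrt(n)  # E = t_{alpha/2, n-1} * s / sqrt(n)
    return margin, x_bar - margin, x_bar + margin


times = [33, 9.7, 8.4, 11.8, 7, 6.5, 11.1, 10.4, 12.4]  # minutes
n = len(times)
t_crit = 1.860  # t_{0.05, 8} from the t-table (assumed, not computed here)

# Full precision: mean and standard deviation are not rounded first.
e_full, lo_full, hi_full = t_interval(mean(times), stdev(times), n, t_crit)

# Rounded summary statistics, as in the worked solution.
e_round, lo_round, hi_round = t_interval(12.26, 8.04, n, t_crit)

print(f"full precision: ({lo_full:.2f}, {hi_full:.2f})")
print(f"rounded inputs: ({lo_round:.2f}, {hi_round:.2f})")
```

The lower limits differ in the second decimal place (about 7.27 versus 7.28) because the worked solution rounds $\bar{X}$ and $s$ before computing $E$; rounding only at the final step avoids this kind of drift.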
Estimation of the Population Proportion
Approximating the Population Proportion $p$
• Point Estimate: Sample Proportion $\hat{p}$

$\hat{p} = \frac{x}{n}$

where: $x$ = number of sample units that possess the characteristic of interest
$n$ = sample size
• Moreover, $\hat{q}$ is the proportion of sample units that do not possess the characteristic of interest, where $\hat{q} = 1 - \hat{p}$.
• The estimated variance of the sample proportion $\hat{p}$ is $\frac{\hat{p}\hat{q}}{n}$.
Example 5. A survey was conducted at a certain hospital. The results show that out of 1,500 randomly selected operations, 1,432 were successful. Find the point estimate for the proportion of successful operations in the hospital.
Solution:
The characteristic of interest here is a successful operation, hence
$\hat{p} = \frac{x}{n} = \frac{1432}{1500} = 0.95$ or 95%
In addition, we can also compute $\hat{q}$, the sample proportion of unsuccessful operations:
$\hat{q} = 1 - \hat{p} = 1 - 0.95 = 0.05$ or 5%
Interval Estimation for Population Proportion $p$
Confidence Interval for the Population Proportion $p$

$\hat{p} - E < p < \hat{p} + E$

• In this confidence interval, the margin of error is computed as

$E = Z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}$

Assumptions for Finding a CI for Population Proportion

1. The sample is a random sample.

2. The confidence interval can only be used if we can assume the sample proportions follow a normal distribution.

• To verify this condition, we must check that

$n\hat{p} \geq 5$ and $n\hat{q} \geq 5$

Example 6. A survey was conducted with 1,404 random respondents and found that 323 students paid for their education by student loans. Construct a 90% confidence interval of the true proportion of students who paid for their education by student loans.
Given: $n = 1404$, $x = 323$, $1 - \alpha = 0.90 \rightarrow \alpha = 0.10$
$\hat{p} = \frac{x}{n} = \frac{323}{1404} = 0.23 \rightarrow \hat{q} = 1 - \hat{p} = 1 - 0.23 = 0.77$
Solve for margin of error.

$E = Z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}} = Z_{0.10/2} \sqrt{\frac{(0.23)(0.77)}{1404}} = 1.645 \sqrt{\frac{(0.23)(0.77)}{1404}} = 0.0185$

Solve for limits.
$\hat{p} - E < p < \hat{p} + E$
$0.23 - 0.0185 < p < 0.23 + 0.0185$
$0.2115 < p < 0.2485$
$21.15\% < p < 24.85\%$

We are 90% confident that the interval from 21.15% to 24.85% actually does contain the true proportion of students who paid for their education by student loans.

Example 7. Suppose that a market research firm is hired to estimate the percentage of adults living in a large city who have cellphones. Five hundred randomly selected adult residents in this city are surveyed to determine whether they have cellphones. Of the 500 people sampled, 420 responded yes, they own cellphones. Using a 95% confidence level, compute a confidence interval estimate for the true proportion of adult residents of this city who have cellphones.
Given: $n = 500$, $x = 420$, $1 - \alpha = 0.95 \rightarrow \alpha = 0.05$
$\hat{p} = \frac{x}{n} = \frac{420}{500} = 0.84 \rightarrow \hat{q} = 1 - \hat{p} = 1 - 0.84 = 0.16$
Solve for margin of error.

$E = Z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}} = Z_{0.05/2} \sqrt{\frac{(0.84)(0.16)}{500}} = 1.96 \sqrt{\frac{(0.84)(0.16)}{500}} = 0.0321$
Solve for limits.
$\hat{p} - E < p < \hat{p} + E$
$0.84 - 0.0321 < p < 0.84 + 0.0321$
$0.8079 < p < 0.8721$
$80.79\% < p < 87.21\%$

We are 95% confident that the interval from 80.79% to 87.21% actually does contain the true proportion of adult residents who have cellphones.
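
The proportion interval, together with the check $n\hat{p} \geq 5$ and $n\hat{q} \geq 5$, can be sketched as follows. The function name `proportion_ci` is illustrative, and the critical value is taken from the Z-table. The call reproduces the cellphone example.

```python
from math import sqrt


def proportion_ci(x, n, z):
    """Return (p_hat, E, lower, upper) for a CI of a population proportion."""
    p_hat = x / n
    q_hat = 1 - p_hat
    # Normal-approximation condition stated in the assumptions.
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("n*p_hat and n*q_hat must both be at least 5")
    margin = z * sqrt(p_hat * q_hat / n)  # E = Z_{alpha/2} * sqrt(p*q/n)
    return p_hat, margin, p_hat - margin, p_hat + margin


# 420 of 500 sampled adults own cellphones; 95% level, Z = 1.96.
p_hat, margin, lower, upper = proportion_ci(420, 500, z=1.96)

print(f"p_hat = {p_hat:.2f}, E = {margin:.4f}")
print(f"{lower:.2%} < p < {upper:.2%}")
```

Raising an error when the condition fails is one design choice; a caller could instead fall back to a method that does not rely on the normal approximation.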

Estimation of the Population Variance and Standard Deviation

Approximating the Population Variance and Population Standard Deviation

• Point Estimate for Variance: Sample Variance $s^2$

$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$

• Point Estimate for SD: Sample Standard Deviation $s$

$s = \sqrt{s^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$

Interval Estimation for $\sigma^2$ and $\sigma$
CI for the Population Variance and Standard Deviation
Formula for the Confidence Interval for a Variance:

$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$

Formula for the Confidence Interval for a Standard Deviation:

$\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}}$

Note: degrees of freedom $df = n - 1$

Example:

Find the critical value of $\chi^2_{0.05}$ at d.f. = 15.

Answer: 24.996

Assumptions for Finding a CI for a Variance or Standard Deviation

1. The sample is a random sample.

2. The population must be normally distributed.

Example 8:
Find the 95% confidence interval for the population variance and
standard deviation of the nicotine content of cigarettes manufactured if a
random sample of 20 cigarettes has a standard deviation of 1.6
milligrams. Assume the variable is normally distributed.

Given: 𝑛 = 20, 𝑠 = 1.6, 1 − 𝛼 = 0.95 → 𝛼 = 0.05

Solution:
$df = n - 1 = 19$

The critical value for $\chi^2_{\alpha/2}$ with $df = 19$ is
$\chi^2_{\alpha/2} = \chi^2_{0.05/2} = \chi^2_{0.025} = 32.852$

The critical value for $\chi^2_{1-\alpha/2}$ with $df = 19$ is
$\chi^2_{1-\alpha/2} = \chi^2_{1-0.05/2} = \chi^2_{0.975} = 8.9066$

The 95% confidence interval for the population variance is

$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$

$\frac{(20-1)(1.6)^2}{\chi^2_{0.05/2}} < \sigma^2 < \frac{(20-1)(1.6)^2}{\chi^2_{1-0.05/2}}$

$\frac{(20-1)(1.6)^2}{\chi^2_{0.025}} < \sigma^2 < \frac{(20-1)(1.6)^2}{\chi^2_{0.975}}$

$\frac{(20-1)(1.6)^2}{32.852} < \sigma^2 < \frac{(20-1)(1.6)^2}{8.9066}$

$1.48 < \sigma^2 < 5.46$

Therefore, the 95% confidence interval for the population variance of the nicotine content is between 1.48 and 5.46.

The 95% confidence interval for the population standard deviation is

$\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}}$

$\sqrt{1.48} < \sigma < \sqrt{5.46}$

$1.22 < \sigma < 2.34$

Therefore, the 95% confidence interval for the population standard deviation of the nicotine content is between 1.22 mg and 2.34 mg.
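
The variance and standard deviation intervals of Example 8 can be reproduced with a short sketch. The function name `variance_ci` is illustrative, and both chi-square critical values are taken from the table for $df = 19$ rather than computed.

```python
from math import sqrt


def variance_ci(s, n, chi2_right, chi2_left):
    """Return (lower, upper) limits of a CI for the population variance.

    chi2_right is the critical value chi^2_{alpha/2}; chi2_left is
    chi^2_{1 - alpha/2}. Both use df = n - 1.
    """
    numerator = (n - 1) * s ** 2
    # The larger critical value gives the lower limit, and vice versa.
    return numerator / chi2_right, numerator / chi2_left


# Nicotine content: n = 20, s = 1.6 mg, 95% level.
var_lower, var_upper = variance_ci(1.6, 20, chi2_right=32.852, chi2_left=8.9066)

# Limits for the standard deviation are the square roots of the variance limits.
sd_lower, sd_upper = sqrt(var_lower), sqrt(var_upper)

print(f"{var_lower:.2f} < sigma^2 < {var_upper:.2f}")
print(f"{sd_lower:.2f} < sigma < {sd_upper:.2f}")
```

Taking square roots of the unrounded variance limits gives roughly 1.22 and 2.34 for the standard deviation.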

Sample Size Determination
• Sample size determination is closely related to statistical estimation.
• Quite often you ask, How large a sample is necessary to make an accurate
estimate?
• The answer is not simple, since it depends on three things: the margin of
error, the population standard deviation, and the degree of confidence.
• For example, how close to the true mean do you want to be (2 units, 5
units, etc.), and how confident do you wish to be (90%, 95%, 99%, etc.)?

Sample Size Determination (Mean)
Minimum sample size needed for an interval estimate of the population mean:
Remark: Assume that the population standard deviation of the variable is known or has been estimated from a previous study.

$n = \left( \frac{Z_{\alpha/2} \cdot \sigma}{E} \right)^2$

Where:
$Z_{\alpha/2}$ is the critical value
$\sigma$ is the population standard deviation
$E$ is the margin of error.
Note: If there is any fraction or decimal portion in the answer, use the next whole number for sample size $n$.
Example 9. A scientist wishes to estimate the average depth of a river. He wants to be 99% confident that the estimate is accurate within 2 feet. From a previous study, the standard deviation of the depths measured was 4.33 feet. Find the minimum sample size needed.

Solution:
Given: $\alpha = 1 - 0.99 = 0.01 \rightarrow Z_{0.01/2} = Z_{0.005} = 2.575$, $\sigma = 4.33$, $E = 2$

$n = \left( \frac{Z_{\alpha/2} \cdot \sigma}{E} \right)^2 = \left( \frac{(2.575)(4.33)}{2} \right)^2 = 31.08$, which is rounded up to 32

Therefore, to be 99% confident that the estimate is within 2 feet of the true mean depth, the scientist needs a sample of at least 32 measurements.
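
The sample size formula and the round-up rule translate directly into code. In this sketch the function name `sample_size_for_mean` is illustrative, and `math.ceil` implements "use the next whole number".

```python
from math import ceil


def sample_size_for_mean(z, sigma, margin):
    """Minimum n so that the CI for the mean has margin of error <= margin."""
    raw = (z * sigma / margin) ** 2
    # Always round UP: rounding down would give a margin larger than requested.
    return ceil(raw)


# River depth: 99% confidence (Z = 2.575), sigma = 4.33 ft, E = 2 ft.
n_required = sample_size_for_mean(z=2.575, sigma=4.33, margin=2)
print("minimum sample size:", n_required)
```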

Sample Size Determination (Proportion)
Minimum sample size needed for an interval estimate of the population proportion:

$n = \hat{p}\hat{q} \left( \frac{Z_{\alpha/2}}{E} \right)^2$

Where:
$Z_{\alpha/2}$ is the critical value
$\hat{p}$ is the sample proportion and $\hat{q} = 1 - \hat{p}$
$E$ is the margin of error.

Note: If there is any fraction or decimal portion in the answer, use the next whole number for sample size $n$.
Example 10. A researcher wishes to estimate, with 95% confidence, the proportion of people who own a home computer. A previous study shows that 40% of those interviewed had a computer at home. The researcher wishes to be accurate within 2% of the true proportion. Find the minimum sample size necessary.
Solution: Given: $\alpha = 0.05 \rightarrow Z_{0.05/2} = 1.96$, $\hat{p} = 0.40$, $\hat{q} = 0.60$ and $E = 2\% = 0.02$

$n = \hat{p}\hat{q} \left( \frac{Z_{\alpha/2}}{E} \right)^2 = (0.40)(0.60) \left( \frac{1.96}{0.02} \right)^2 = 2304.96$, which is rounded up to 2305

Therefore, to be 95% confident that the estimate is within 2% of the true proportion, the researcher must interview at least 2305 people.
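
The proportion version of the sample size formula follows the same pattern. This is a sketch with an illustrative function name, `sample_size_for_proportion`; the prior estimate of the proportion and the critical value are supplied by the caller.

```python
from math import ceil


def sample_size_for_proportion(z, p_hat, margin):
    """Minimum n so that the CI for a proportion has margin of error <= margin."""
    q_hat = 1 - p_hat
    raw = p_hat * q_hat * (z / margin) ** 2
    return ceil(raw)  # use the next whole number


# Home computers: 95% confidence (Z = 1.96), prior p_hat = 0.40, E = 0.02.
n_required = sample_size_for_proportion(z=1.96, p_hat=0.40, margin=0.02)
print("minimum sample size:", n_required)
```

Because $E$ appears squared in the denominator, halving the margin of error roughly quadruples the required sample size.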
