11. One sample inference

The document discusses statistical inference about means, specifically focusing on the use of the t-distribution for calculating confidence intervals and conducting one-sample t-tests. It provides examples, including the analysis of flying snake undulation rates and human body temperature, illustrating how to estimate means and variances using sample data. The document emphasizes the importance of normal distribution and the assumptions required for valid t-tests.

Uploaded by

hikkihikaruww

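Numerical variables from a single sample
Chapter 11

Example population: $\mu = 67.4$, $\sigma = 3.9$

$\bar{Y}$ is normally distributed whenever:
Y is normally distributed
or
n is large

$\mu_{\bar{Y}} = \mu = 67.4$

$\sigma_{\bar{Y}} = \frac{\sigma}{\sqrt{n}} = \frac{3.9}{\sqrt{5}} = 1.7$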
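The relationship $\sigma_{\bar{Y}} = \sigma / \sqrt{n}$ can be checked with a small simulation. The sketch below assumes the population is normal with the mean and standard deviation from the example; the seed, the number of replicates, and the variable names are illustrative choices, not part of the original slides.

```r
# Simulation sketch: standard deviation of the sample mean for n = 5.
set.seed(1)                 # arbitrary seed, for reproducibility only
mu    <- 67.4               # population mean from the example
sigma <- 3.9                # population standard deviation from the example
n     <- 5                  # sample size from the example

# Theoretical standard deviation of the sample mean
sigma_ybar <- sigma / sqrt(n)

# Draw 10,000 samples of size n and record each sample mean
sample_means <- replicate(10000, mean(rnorm(n, mean = mu, sd = sigma)))

round(sigma_ybar, 1)        # 1.7, as on the slide
mean(sample_means)          # should be close to mu
sd(sample_means)            # should be close to sigma_ybar
```

With many replicates, the standard deviation of the simulated means should settle near the theoretical value.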

Inference about means

Because $\bar{Y}$ is normally distributed, we can convert its distribution to a standard normal distribution:

$Z = \frac{\bar{Y} - \mu}{\sigma_{\bar{Y}}} = \frac{\bar{Y} - \mu}{\sigma / \sqrt{n}}$

This would give a probability distribution of the difference between a sample mean and the population mean.

But... we don't know $\sigma$...

However, we do know s, the standard deviation of our sample. We can use that as an estimate of $\sigma$.

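In most cases, we don't know the real population distribution. We only have a sample.

Population: $\mu = 67.4$, $\sigma = 3.9$
Sample: $\bar{Y} = 67.1$, $s = 3.1$

$SE_{\bar{Y}} = \frac{s}{\sqrt{n}} = \frac{3.1}{\sqrt{5}} = 1.4$

We use this as an estimate of $\sigma_{\bar{Y}}$.

A good approximation to the standard normal is then:

$t = \frac{\bar{Y} - \mu}{SE_{\bar{Y}}} = \frac{\bar{Y} - \mu}{s / \sqrt{n}}$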

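t has a Student's t distribution
Discovered by William Gosset, of the Guinness Brewing Company

Degrees of freedom
df = n - 1

$Z = \frac{\bar{Y} - \mu}{\sigma_{\bar{Y}}}$

$t = \frac{\bar{Y} - \mu}{SE_{\bar{Y}}}$

[Figure: the standard normal distribution (Z) compared with the t distribution with 9 degrees of freedom ($t_9$)]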
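One way to see how Z and t with 9 degrees of freedom differ is to compare the values that cut off the upper 2.5% of each distribution. This is a minimal sketch using base R's `qnorm` and `qt`; the variable names and the extra df values are illustrative.

```r
# Upper 2.5% cutoffs: the t distribution has heavier tails than Z,
# so its cutoff lies farther from zero.
z_crit  <- qnorm(0.975)          # standard normal, about 1.96
t9_crit <- qt(0.975, df = 9)     # t with 9 degrees of freedom, about 2.26

z_crit
t9_crit

# As df grows, the t cutoff approaches the normal cutoff
t_by_df <- qt(0.975, df = c(2, 9, 30, 1000))
t_by_df
```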

We use the t-distribution to calculate a confidence interval of the mean

$-t_{\alpha(2),df} < \frac{\bar{Y} - \mu}{SE_{\bar{Y}}} < t_{\alpha(2),df}$

We rearrange the above to generate:

$\bar{Y} - t_{\alpha(2),df} \, SE_{\bar{Y}} < \mu < \bar{Y} + t_{\alpha(2),df} \, SE_{\bar{Y}}$

Another way to express this is: $\bar{Y} \pm SE_{\bar{Y}} \, t_{\alpha(2),df}$

95% confidence interval for a mean

Example: Paradise flying snakes
https://www.youtube.com/watch?v=16aGSx9gFO4

Undulation rates (in Hz)
0.9, 1.4, 1.2, 1.2, 1.3, 2.0, 1.4, 1.6

Estimate the mean and standard deviation

$\bar{Y} = 1.375$
$s = 0.324$
$n = 8$


Find the standard error

$\bar{Y} \pm SE_{\bar{Y}} \, t_{\alpha(2),df}$

$SE_{\bar{Y}} = \frac{s}{\sqrt{n}} = \frac{0.324}{\sqrt{8}} = 0.115$

Find the critical value of t

$df = n - 1 = 7$

$t_{\alpha(2),df} = t_{0.05(2),7} = 2.36$
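The mean, standard deviation, and standard error can be computed from the eight undulation rates listed in the example. The sketch below assumes the vector `undulationRate` holds exactly those eight values; the other variable names are illustrative.

```r
# Undulation rates (Hz) for the eight snakes listed in the example
undulationRate <- c(0.9, 1.4, 1.2, 1.2, 1.3, 2.0, 1.4, 1.6)

n    <- length(undulationRate)   # sample size
ybar <- mean(undulationRate)     # sample mean
s    <- sd(undulationRate)       # sample standard deviation (n - 1 denominator)
se   <- s / sqrt(n)              # standard error of the mean

ybar           # 1.375
round(s, 3)    # 0.324
round(se, 3)   # 0.115
```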

Table C: Student's t distribution


Finding the critical value of t in R

qt(0.025, df = 7, lower.tail = FALSE)


[1] 2.364624
Putting it all together...

$\bar{Y} \pm SE_{\bar{Y}} \, t_{\alpha(2),df} = 1.375 \pm 0.115 \, (2.36) = 1.375 \pm 0.271$

$1.10 < \mu < 1.65$
(95% confidence interval)

99% confidence interval

$t_{\alpha(2),df} = t_{0.01(2),7} = 3.50$

$\bar{Y} \pm SE_{\bar{Y}} \, t_{\alpha(2),df} = 1.375 \pm 0.115 \, (3.50) = 1.375 \pm 0.403$

$0.97 < \mu < 1.78$
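The interval $\bar{Y} \pm SE_{\bar{Y}} \, t_{\alpha(2),df}$ can also be assembled step by step in R. This is a sketch, not the course's own code: `ci_mean` is a hypothetical helper written for illustration, and `undulationRate` is assumed to hold the eight rates from the example.

```r
undulationRate <- c(0.9, 1.4, 1.2, 1.2, 1.3, 2.0, 1.4, 1.6)

# Hypothetical helper: two-sided confidence interval for a mean,
# built directly from Y-bar +/- SE * t_crit.
ci_mean <- function(y, level = 0.95) {
  n      <- length(y)
  se     <- sd(y) / sqrt(n)                             # standard error
  alpha  <- 1 - level
  t_crit <- qt(alpha / 2, df = n - 1, lower.tail = FALSE)  # two-tailed critical value
  mean(y) + c(-1, 1) * t_crit * se                      # lower and upper limits
}

ci95 <- ci_mean(undulationRate, level = 0.95)
ci99 <- ci_mean(undulationRate, level = 0.99)
ci95
ci99
```

Because the helper uses unrounded values of s and t, its limits can differ in the third decimal place from a hand calculation that uses 0.115, 2.36, and 3.50.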

Confidence interval for mean in R

t.test(undulationRate)$conf.int
[1] 1.104098 1.645902
attr(,"conf.level")
[1] 0.95

t.test(undulationRate, conf.level = 0.99)$conf.int
[1] 0.9740838 1.7759162
attr(,"conf.level")
[1] 0.99

Confidence interval for the variance

$\frac{df \, s^2}{\chi^2_{\alpha/2,\,df}} \le \sigma^2 \le \frac{df \, s^2}{\chi^2_{1-\alpha/2,\,df}}$
[Figure: $\chi^2$ distribution (frequency against $\chi^2$) for $\alpha = 0.05$, with 2.5% of the area below $\chi^2_{1-\alpha/2}$ and 2.5% above $\chi^2_{\alpha/2}$]

95% confidence interval for the variance of flying snake undulation rate

$\frac{df \, s^2}{\chi^2_{\alpha/2,\,df}} \le \sigma^2 \le \frac{df \, s^2}{\chi^2_{1-\alpha/2,\,df}}$

$df = n - 1 = 7$

$s^2 = (0.324)^2 = 0.105$

$\chi^2_{\alpha/2,\,df} = \chi^2_{0.025,7} = 16.01$

$\chi^2_{1-\alpha/2,\,df} = \chi^2_{0.975,7} = 1.69$

> qchisq(0.025,df=7,lower.tail=FALSE)
[1] 16.01276
> qchisq(0.975,df=7,lower.tail=FALSE)
[1] 1.689869

Table A: critical values of the $\chi^2$ distribution, by df and upper-tail probability X

df   X=0.999  0.995    0.99     0.975    0.95     0.05    0.025   0.01    0.005   0.001
1    1.6E-6   3.9E-5   0.00016  0.00098  0.00393  3.84    5.02    6.63    7.88    10.83
2    0.00     0.01     0.02     0.05     0.10     5.99    7.38    9.21    10.60   13.82
3    0.02     0.07     0.11     0.22     0.35     7.81    9.35    11.34   12.84   16.27
4    0.09     0.21     0.30     0.48     0.71     9.49    11.14   13.28   14.86   18.47
5    0.21     0.41     0.55     0.83     1.15     11.07   12.83   15.09   16.75   20.52
6    0.38     0.68     0.87     1.24     1.64     12.59   14.45   16.81   18.55   22.46
7    0.60     0.99     1.24     1.69     2.17     14.07   16.01   18.48   20.28   24.32
8    0.86     1.34     1.65     2.18     2.73     15.51   17.53   20.09   21.95   26.12
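The variance interval can be computed from the same pieces: df, the sample variance, and the two $\chi^2$ critical values. The sketch below assumes `undulationRate` holds the eight rates from the example; the other names are illustrative.

```r
undulationRate <- c(0.9, 1.4, 1.2, 1.2, 1.3, 2.0, 1.4, 1.6)

dof   <- length(undulationRate) - 1   # degrees of freedom, n - 1 = 7
s2    <- var(undulationRate)          # sample variance
alpha <- 0.05                         # for a 95% interval

# Critical values, using upper-tail areas as in Table A
chi_upper <- qchisq(alpha / 2,     df = dof, lower.tail = FALSE)  # about 16.01
chi_lower <- qchisq(1 - alpha / 2, df = dof, lower.tail = FALSE)  # about 1.69

# Dividing by the larger critical value gives the lower limit
var_ci <- c(lower = dof * s2 / chi_upper, upper = dof * s2 / chi_lower)
var_ci
```

Taking the square root of both limits would give an interval for the standard deviation.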
95% confidence interval for the variance of flying snake undulation rate

$\frac{df \, s^2}{\chi^2_{\alpha/2,\,df}} \le \sigma^2 \le \frac{df \, s^2}{\chi^2_{1-\alpha/2,\,df}}$

$\frac{7 \, (0.324)^2}{16.01} \le \sigma^2 \le \frac{7 \, (0.324)^2}{1.69}$

$0.0459 \le \sigma^2 \le 0.435$

One-sample t-test

The one-sample t-test compares the mean of a random sample from a normal population with the population mean proposed in a null hypothesis.

Test statistic for one-sample t-test

$t = \frac{\bar{Y} - \mu_0}{s / \sqrt{n}}$

$\mu_0$ is the mean value proposed by $H_0$

Hypotheses for one-sample t-tests

$H_0$: The mean of the population is $\mu_0$.

$H_A$: The mean of the population is not $\mu_0$.

Example: Human body temperature

$H_0$: Mean healthy human body temperature is 98.6°F.
$H_A$: Mean healthy human body temperature is not 98.6°F.

Human body temperature

$n = 24$
$\bar{Y} = 98.28$
$s = 0.940$

$t = \frac{\bar{Y} - \mu_0}{s / \sqrt{n}} = \frac{98.28 - 98.6}{0.940 / \sqrt{24}} = -1.67$

Degrees of freedom

df = n – 1 = 23

Comparing t to its distribution to find the P-value

A portion of the t table

df    α(1)=0.1   α(1)=0.05   α(1)=0.025   α(1)=0.01   α(1)=0.005
      α(2)=0.2   α(2)=0.10   α(2)=0.05    α(2)=0.02   α(2)=0.01
...   ...        ...         ...          ...         ...
20    1.33       1.72        2.09         2.53        2.85
21    1.32       1.72        2.08         2.52        2.83
22    1.32       1.72        2.07         2.51        2.82
23    1.32       1.71        2.07         2.50        2.81
24    1.32       1.71        2.06         2.49        2.80
25    1.32       1.71        2.06         2.49        2.79
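When only summary statistics are available, the test statistic, critical value, and two-sided P-value can be computed directly. In the sketch below, `one_sample_t_summary` is a hypothetical helper written for illustration (it is not a base R function); the inputs are the n = 24 body temperature values from the example.

```r
# Hypothetical helper: one-sample t-test from summary statistics.
one_sample_t_summary <- function(ybar, s, n, mu0, alpha = 0.05) {
  se     <- s / sqrt(n)                      # standard error of the mean
  t_stat <- (ybar - mu0) / se                # test statistic
  dof    <- n - 1                            # degrees of freedom
  t_crit <- qt(1 - alpha / 2, df = dof)      # two-tailed critical value
  p_val  <- 2 * pt(-abs(t_stat), df = dof)   # two-sided P-value
  list(t = t_stat, df = dof, t_crit = t_crit, p.value = p_val)
}

res <- one_sample_t_summary(ybar = 98.28, s = 0.940, n = 24, mu0 = 98.6)
round(res$t, 2)        # -1.67
res$df                 # 23
round(res$t_crit, 2)   # 2.07
res$p.value > 0.05     # TRUE: the null hypothesis is not rejected
```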

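–1.67 is closer to 0 than –2.07, so P > 0.05.

With these data, we cannot reject the null hypothesis that the mean human body temperature is 98.6°F.

t.test(bodyTempSmallData$temperature, mu = 98.6)

        One Sample t-test

data:  bodyTempSmallData$temperature
t = -0.56065, df = 24, p-value = 0.5802
alternative hypothesis: true mean is not equal to 98.6
95 percent confidence interval:
 98.24422 98.80378
sample estimates:
mean of x
   98.524

Note that this R output reports df = 24 and a sample mean of 98.524, so it was run on 25 observations and does not reproduce the hand calculation (n = 24, t = –1.67). The conclusion is the same: P = 0.58 > 0.05, so the null hypothesis is not rejected.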
Body temperature revisited: n = 130

$n = 130$
$\bar{Y} = 98.25$
$s = 0.733$

$t = \frac{\bar{Y} - \mu_0}{s / \sqrt{n}} = \frac{98.25 - 98.6}{0.733 / \sqrt{130}} = -5.44$

$t_{0.05(2),129} = \pm 1.98$

t is further out in the tail than the critical value, so we can reject the null hypothesis. Human body temperature is not 98.6°F. $P = 2.4 \times 10^{-7}$
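The decision rule for the larger sample (reject when |t| exceeds the critical value) can be written out in a few lines. This is a sketch from the summary statistics on the slide; the variable names are illustrative.

```r
# Body temperature, n = 130: compare |t| with the two-tailed critical value
ybar <- 98.25
s    <- 0.733
n    <- 130
mu0  <- 98.6

t_stat <- (ybar - mu0) / (s / sqrt(n))          # about -5.44
t_crit <- qt(0.975, df = n - 1)                 # about 1.98 for df = 129
reject <- abs(t_stat) > t_crit                  # TRUE: t lies beyond the critical value
p_val  <- 2 * pt(-abs(t_stat), df = n - 1)      # two-sided P-value, far below 0.05

round(t_stat, 2)
round(t_crit, 2)
reject
p_val
```

The same difference in means that was inconclusive at n = 24 becomes decisive here because the standard error shrinks with the square root of the sample size.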

One-sample t-test: Assumptions

1. The variable is normally distributed.

2. The sample is a random sample.
