
Statistical and Mathematical Analysis

Your Name

November 26, 2024

Abstract

This document presents an in-depth statistical and mathematical analysis of...

SECTION A

A1. Density Histogram and Estimation of the Mean


Suppose that we have a sample of observations x1 , x2 , . . . , xn obtained by random sampling from a continuous distribution with probability density function f (x).
The range [a1 , aK+1 ] is divided evenly into K subintervals, or bins, Bk = (ak , ak+1 ], where
k = 1, 2, . . . , K, each of width h. Recall that the density histogram based on these bins
is defined as:
\[
\mathrm{Hist}(x) =
\begin{cases}
\dfrac{v_k}{nh} & \text{for } x \in B_k, \\[4pt]
0 & \text{otherwise,}
\end{cases}
\]
where vk is the frequency count of the number of data points that fall in bin Bk , and n
is the total number of data points.

(i) Definition of the Notation and Equation for h


In the formula for the histogram, the notation vk refers to the number of observations
that fall within the bin Bk = (ak , ak+1 ]. The width of each bin, h, can be expressed in
terms of ak and ak+1 as:

\[
h = a_{k+1} - a_k \quad \text{for } k = 1, 2, \dots, K.
\]

(ii) Estimating the Mean Using the Histogram


Group II Inferential statistics Exam 2017-2018

The distribution mean (or population mean) is defined as:

\[
\mu = \int_{-\infty}^{\infty} x f(x) \, dx.
\]

One way of estimating the population mean is by substituting Hist(x) as an estimate of the probability density function f (x). This gives an estimate of the mean as:
\[
\hat{\mu}_{\mathrm{Hist}} = \int_{a_1}^{a_{K+1}} x \, \mathrm{Hist}(x) \, dx.
\]

Using the formula for Hist(x) in the intervals Bk = (ak , ak+1 ], we can write this as:

\[
\hat{\mu}_{\mathrm{Hist}} = \int_{a_1}^{a_{K+1}} x \sum_{k=1}^{K} \frac{v_k}{nh} \, \mathbf{1}_{B_k}(x) \, dx,
\]
where 1Bk (x) is the indicator function that is 1 if x ∈ Bk and 0 otherwise.
This integral simplifies as:

\[
\hat{\mu}_{\mathrm{Hist}} = \frac{1}{nh} \sum_{k=1}^{K} v_k \int_{a_k}^{a_{k+1}} x \, dx.
\]

We now compute the integral for each bin Bk = (ak , ak+1 ]:

\[
\int_{a_k}^{a_{k+1}} x \, dx = \left[ \frac{x^2}{2} \right]_{a_k}^{a_{k+1}} = \frac{a_{k+1}^2}{2} - \frac{a_k^2}{2}.
\]

Thus, the estimate for the mean becomes:

\[
\hat{\mu}_{\mathrm{Hist}} = \frac{1}{nh} \sum_{k=1}^{K} v_k \left( \frac{a_{k+1}^2}{2} - \frac{a_k^2}{2} \right).
\]

Since \(a_{k+1}^2 - a_k^2 = (a_{k+1} - a_k)(a_{k+1} + a_k) = h(a_{k+1} + a_k)\), the bin width h cancels, and the estimate becomes:

\[
\hat{\mu}_{\mathrm{Hist}} = \frac{1}{2n} \sum_{k=1}^{K} v_k (a_{k+1} + a_k).
\]

This is the desired expression for the estimated mean based on the histogram; it is the frequency-weighted average of the bin midpoints \((a_k + a_{k+1})/2\).

(iii) Estimating the Mean of the Distribution from the Data


1. Frequency Table:

The data set is given by the following frequency table:

Interval (5, 10] (10, 15] (15, 20] (20, 25] (25, 30] (30, 35]
Frequency 1 11 39 38 10 1

The bin midpoints are as follows:

- For (5, 10], midpoint = (5 + 10)/2 = 7.5.
- For (10, 15], midpoint = (10 + 15)/2 = 12.5.
- For (15, 20], midpoint = (15 + 20)/2 = 17.5.
- For (20, 25], midpoint = (20 + 25)/2 = 22.5.
- For (25, 30], midpoint = (25 + 30)/2 = 27.5.
- For (30, 35], midpoint = (30 + 35)/2 = 32.5.

The bin width is h = 5 for each interval.
Using the formula for the estimated mean:

\[
\hat{\mu}_{\mathrm{Hist}} = \frac{1}{2n} \sum_{k=1}^{6} v_k (a_{k+1} + a_k),
\]

we substitute the frequencies and the bin midpoints (since \((a_{k+1} + a_k)/2\) is the midpoint of bin k, the estimate is \(\frac{1}{n}\sum_k v_k \cdot \text{midpoint}_k\)):

\[
\hat{\mu}_{\mathrm{Hist}} = \frac{1}{100}\bigl(1(7.5) + 11(12.5) + 39(17.5) + 38(22.5) + 10(27.5) + 1(32.5)\bigr) = \frac{1990}{100} = 19.90.
\]

Thus, the estimated mean of the distribution is 19.90 .
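As a sanity check on the arithmetic, the midpoint formula can be evaluated directly; this is a minimal Python sketch, with bin edges and frequencies copied from the table above.

```python
# Histogram-based mean estimate: the frequency-weighted average of the
# bin midpoints, as derived in part (ii).
edges = [5, 10, 15, 20, 25, 30, 35]      # bin edges a_1, ..., a_{K+1}
freqs = [1, 11, 39, 38, 10, 1]           # frequencies v_k from the table

n = sum(freqs)                           # total sample size (100)
midpoints = [(lo + hi) / 2 for lo, hi in zip(edges[:-1], edges[1:])]
mu_hat = sum(v * mid for v, mid in zip(freqs, midpoints)) / n
print(mu_hat)  # 19.9
```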

A2

(i) Definition of an Unbiased Estimator

An estimator θ̂ for a parameter θ is said to be unbiased if its expected value is equal to the true value of the parameter, i.e.,

E[θ̂] = θ.

In other words, the estimator does not systematically overestimate or underestimate the
true parameter value.

Context of the Exercise:

In this exercise, we have two independent samples X1 , X2 , . . . , Xn and Y1 , Y2 , . . . , Ym , drawn from the normal distributions N (µ, σ²) and N (µ, τ²), respectively, where µ is the common population mean, σ² is the population variance of the X's, and τ² is the population variance of the Y's.

The two estimators of the population mean µ are defined as follows:

\[
\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i \quad \text{and} \quad \bar{Y} = \frac{1}{m} \sum_{j=1}^{m} Y_j.
\]

These are sample means of sizes n and m, respectively.

1. Expected Value of X̄:

Since X1 , X2 , . . . , Xn are independent and follow the normal distribution N (µ, σ 2 ), the
expected value of X̄ is:

\[
E[\bar{X}] = E\!\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right] = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{1}{n}\sum_{i=1}^{n} \mu = \mu.
\]

Thus, X̄ is an unbiased estimator of µ.

2. Expected Value of Ȳ :

Similarly, since Y1 , Y2 , . . . , Ym are independent and follow the distribution N (µ, τ 2 ), the
expected value of Ȳ is:

\[
E[\bar{Y}] = E\!\left[\frac{1}{m}\sum_{j=1}^{m} Y_j\right] = \frac{1}{m}\sum_{j=1}^{m} E[Y_j] = \frac{1}{m}\sum_{j=1}^{m} \mu = \mu.
\]

Therefore, Ȳ is also an unbiased estimator of µ.

(ii) Unbiasedness of X̄ and Ȳ , and Calculation of Their Variances


Unbiasedness of X̄ and Ȳ (Reminder):

We have already shown in part (i) that both X̄ and Ȳ are unbiased estimators of µ.
Specifically,

E[X̄] = µ and E[Ȳ ] = µ.

This follows from the fact that each of the Xi ’s and Yj ’s are independent and follow the
distributions N (µ, σ 2 ) and N (µ, τ 2 ), respectively. Thus, their sample means are unbiased
estimators of the population mean µ.

Calculation of the Variance of X̄:

The variance of the sample mean X̄ is given by the formula:

\[
\operatorname{Var}(\bar{X}) = \frac{\operatorname{Var}(X_i)}{n}.
\]
Since the Xi ’s are independent and follow N (µ, σ 2 ), the variance of each Xi is σ 2 . There-
fore, the variance of X̄ is:

\[
\operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n}.
\]

Calculation of the Variance of Ȳ :

Similarly, the variance of the sample mean Ȳ is:

\[
\operatorname{Var}(\bar{Y}) = \frac{\operatorname{Var}(Y_j)}{m}.
\]
Since the Yj ’s are independent and follow N (µ, τ 2 ), the variance of each Yj is τ 2 . Thus,
the variance of Ȳ is:

\[
\operatorname{Var}(\bar{Y}) = \frac{\tau^2}{m}.
\]

Summary of Variances:

- The variance of X̄ is Var(X̄) = σ²/n.
- The variance of Ȳ is Var(Ȳ) = τ²/m.

Thus, both X̄ and Ȳ are unbiased estimators of µ, with variances σ²/n and τ²/m, respectively.

(iii) Minimizing the Variance of the Combined Estimator


We are given that the combined estimator for the population mean µ is:

µ̂ = wX̄ + (1 − w)Ȳ ,

where \(\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i\) and \(\bar{Y} = \frac{1}{m}\sum_{j=1}^{m} Y_j\).
We are tasked with finding the value of w that minimizes the variance of µ̂.

1. Variance of µ̂:

Since X̄ and Ȳ are independent, the variance of µ̂ is:

\[
\operatorname{Var}(\hat{\mu}) = \operatorname{Var}\bigl(w\bar{X} + (1 - w)\bar{Y}\bigr) = w^2 \operatorname{Var}(\bar{X}) + (1 - w)^2 \operatorname{Var}(\bar{Y}).
\]
We know that:

\[
\operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n}, \qquad \operatorname{Var}(\bar{Y}) = \frac{\tau^2}{m}.
\]

Therefore, the variance of µ̂ becomes:

\[
\operatorname{Var}(\hat{\mu}) = w^2 \frac{\sigma^2}{n} + (1 - w)^2 \frac{\tau^2}{m}.
\]

2. Minimizing the Variance:

To minimize the variance, we take the derivative of Var(µ̂) with respect to w:

\[
\frac{d}{dw}\operatorname{Var}(\hat{\mu}) = 2w\frac{\sigma^2}{n} - 2(1 - w)\frac{\tau^2}{m}.
\]

Setting the derivative equal to zero:

\[
2w\frac{\sigma^2}{n} - 2(1 - w)\frac{\tau^2}{m} = 0.
\]

Simplifying:

\[
w\frac{\sigma^2}{n} = (1 - w)\frac{\tau^2}{m}.
\]

Expanding:

\[
w\frac{\sigma^2}{n} = \frac{\tau^2}{m} - w\frac{\tau^2}{m}.
\]

Collecting the terms involving w:

\[
w\left(\frac{\sigma^2}{n} + \frac{\tau^2}{m}\right) = \frac{\tau^2}{m}.
\]

Solving for w:

\[
w = \frac{\tau^2/m}{\sigma^2/n + \tau^2/m}.
\]

Multiplying the numerator and denominator by nm, the value of w that minimizes the variance of µ̂ is:

\[
w = \frac{n\tau^2}{m\sigma^2 + n\tau^2}.
\]

3. Verifying that this is a Minimum:

To ensure that this value of w minimizes the variance, we calculate the second derivative
of the variance with respect to w.
The second derivative is:

\[
\frac{d^2}{dw^2}\operatorname{Var}(\hat{\mu}) = 2\frac{\sigma^2}{n} + 2\frac{\tau^2}{m}.
\]

Since both σ²/n and τ²/m are positive, we have:

\[
\frac{d^2}{dw^2}\operatorname{Var}(\hat{\mu}) = 2\left(\frac{\sigma^2}{n} + \frac{\tau^2}{m}\right) > 0.
\]

Therefore, Var(µ̂) is convex in w, confirming that \(w = \frac{n\tau^2}{m\sigma^2 + n\tau^2}\) indeed minimizes the variance.

A3

(i) Definition of Type I and Type II Errors, and Significance Level


1. Type I Error: An error of type I occurs when we reject the null hypothesis H0 when
it is actually true. The probability of committing a type I error is denoted by α, which
is also called the significance level of the test.
2. Type II Error: An error of type II occurs when we fail to reject the null hypothesis
H0 when it is actually false (i.e., when H1 is true). The probability of committing a type
II error is denoted by β.
3. Significance Level α: The significance level α is the probability of rejecting the null
hypothesis when it is true. For example, a 5% significance level corresponds to a 5%
probability of making a type I error.

Hypothesis Testing: Goodness-of-Fit Test


In this exercise, we are testing whether the population mean conforms to a specified value (a conformity test for the mean). Such a test compares the observed data against a hypothesized value of a population parameter, such as the mean or variance; here, we test whether the population mean µ equals a specific value.
We are testing the following hypotheses:

H0 : µ = 13.5 versus H1 : µ ≠ 13.5

This means we want to test whether the population mean is 13.5, or if it differs from this
value (i.e., the mean could be either greater than or less than 13.5).
Now, we will proceed with the hypothesis tests for the following two cases:

• Part (ii): Test with known variance (σ 2 = 1).


• Part (iii): Test with unknown variance, using the sample estimate of variance.

(ii) Testing with Known Variance (σ² = 1) using a z-Test


We are testing the hypotheses:

H0 : µ = 13.5 versus H1 : µ ≠ 13.5

Given that n = 8, \(\sum_{i=1}^{n} x_i = 113.6627\), and σ² = 1, we use a two-tailed z-test. Since the observations are normally distributed and σ² is known, the standardized sample mean follows the standard normal distribution exactly, whatever the sample size; the t-distribution is only needed when σ² has to be estimated from the data.
The test statistic is given by:

\[
z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}.
\]

First, calculate the sample mean x̄:

\[
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{113.6627}{8} = 14.2078.
\]

Now, calculate the test statistic z:

\[
z = \frac{14.2078 - 13.5}{1/\sqrt{8}} = \frac{0.7078}{0.3536} \approx 2.00.
\]

The critical value for a two-tailed test at the 5% significance level is z_{0.025} = 1.96. Since |z| = 2.00 exceeds 1.96, we reject the null hypothesis H0 at the 5% significance level.
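The statistic can be recomputed from the totals given above; a minimal Python sketch:

```python
import math

# Test statistic for H0: mu = 13.5 with known sigma^2 = 1, from the
# totals n = 8 and sum of x_i = 113.6627 given in the problem.
n = 8
sum_x = 113.6627
sigma, mu0 = 1.0, 13.5

xbar = sum_x / n                              # sample mean
stat = (xbar - mu0) / (sigma / math.sqrt(n))  # standardized statistic
print(round(xbar, 4), round(stat, 2))
```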

(iii) Testing with Unknown Variance using t-Test


When σ² is unknown, we use a t-test: for small samples, the t-distribution accounts for the additional uncertainty introduced by estimating the population variance from the data.
The test statistic is given by:

\[
t = \frac{\bar{x} - \mu_0}{s'/\sqrt{n}}.
\]

First, calculate the sample variance s'²:

\[
s'^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right)
= \frac{1}{7}\left(1621.391 - \frac{(113.6627)^2}{8}\right) = \frac{6.4898}{7} = 0.9271,
\]

so s' ≈ 0.9629. Now, calculate the t-statistic:

\[
t = \frac{14.2078 - 13.5}{0.9629/\sqrt{8}} = \frac{0.7078}{0.3404} \approx 2.08.
\]

The critical value for a two-tailed test at the 5% significance level with 7 degrees of freedom is approximately ±2.3646. Since |t| = 2.08 is less than the critical value 2.3646, we fail to reject the null hypothesis H0 at the 5% significance level.
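A quick recomputation of s'² and the t-statistic from the given totals (a sketch; the values in the text are rounded):

```python
import math

# Unknown-variance t statistic from the totals given above:
# n = 8, sum x_i = 113.6627, sum x_i^2 = 1621.391.
n = 8
sum_x, sum_x2 = 113.6627, 1621.391
mu0 = 13.5

xbar = sum_x / n
s2 = (sum_x2 - sum_x**2 / n) / (n - 1)    # unbiased sample variance
t_stat = (xbar - mu0) / math.sqrt(s2 / n)
```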

A4

(i) Definition of a Confidence Interval


An estimator I(X) = [a(X), b(X)] for a parameter θ is said to be a 100(1−α)% confidence
interval for θ if the following condition is satisfied:

P (a(X) ≤ θ ≤ b(X)) = 1 − α.

This means that, in repeated sampling, the interval [a(X), b(X)] will contain the true value of the parameter θ in a proportion 1 − α of cases. The confidence level 100(1 − α)% is the probability that the random interval covers the true parameter value: if we performed the same experiment many times and calculated the interval I(X) each time, a fraction 1 − α of the resulting intervals would contain θ.

(ii) Expression for E(θ̂) and Var(θ̂)


Let \(\bar{X}_1 = \frac{1}{n}\sum_{i=1}^{n} X_{1i}\) and \(\bar{X}_2 = \frac{1}{m}\sum_{j=1}^{m} X_{2j}\) be the sample means of the two independent samples. The estimator of the parameter θ is defined as θ̂ = X̄1 − X̄2, where X̄1 and X̄2 are independent and normally distributed:

\[
\bar{X}_1 \sim N\!\left(\mu_1, \frac{\sigma_1^2}{n}\right) \quad \text{and} \quad \bar{X}_2 \sim N\!\left(\mu_2, \frac{\sigma_2^2}{m}\right).
\]

Thus, the expected value and variance of θ̂ = X̄1 − X̄2 are:

\[
E[\hat{\theta}] = E[\bar{X}_1] - E[\bar{X}_2] = \mu_1 - \mu_2,
\]
\[
\operatorname{Var}(\hat{\theta}) = \operatorname{Var}(\bar{X}_1) + \operatorname{Var}(\bar{X}_2) = \frac{\sigma_1^2}{n} + \frac{\sigma_2^2}{m}.
\]

Hence θ̂ follows a normal distribution:

\[
\hat{\theta} \sim N\!\left(\mu_1 - \mu_2, \; \frac{\sigma_1^2}{n} + \frac{\sigma_2^2}{m}\right).
\]

(iii) 95% Confidence Interval for θ = µ1 − µ2


We are given the following data:

\[
n = 10, \quad m = 20, \quad \sum_{i=1}^{10} X_{1i} = 96.08, \quad \sum_{j=1}^{20} X_{2j} = 237.09,
\]

with known population variances σ1² = 2 and σ2² = 4. From this, we calculate the sample means:

\[
\bar{X}_1 = \frac{96.08}{10} = 9.608, \qquad \bar{X}_2 = \frac{237.09}{20} = 11.8545,
\]

so the point estimate is:

\[
\hat{\theta} = \bar{X}_1 - \bar{X}_2 = 9.608 - 11.8545 = -2.2465.
\]

Since the population variances are known, θ̂ is exactly normally distributed, and we use the standard normal quantile z_{0.025} = 1.96 (no t-distribution is needed here, whatever the sample sizes). The standard error of θ̂ is:

\[
SE(\hat{\theta}) = \sqrt{\frac{\sigma_1^2}{n} + \frac{\sigma_2^2}{m}} = \sqrt{\frac{2}{10} + \frac{4}{20}} = \sqrt{0.4} \approx 0.6325.
\]

Thus, the 95% confidence interval for θ = µ1 − µ2 is:

\[
\hat{\theta} \pm z_{0.025} \times SE(\hat{\theta}) = -2.2465 \pm 1.96 \times 0.6325.
\]

The margin of error is 1.96 × 0.6325 ≈ 1.2396, so the 95% confidence interval is:

\[
[-2.2465 - 1.2396, \; -2.2465 + 1.2396] = [-3.486, -1.007].
\]

(iv) Is it plausible that µ1 = µ2?


The 95% confidence interval for µ1 − µ2 is [−3.486, −1.007]. Since this interval does not contain 0, it is not plausible that µ1 = µ2 at the 95% confidence level.
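The interval computation can be sketched as follows (with the variances known, the normal quantile 1.96 is the natural critical value):

```python
import math

# 95% interval for mu1 - mu2 with known variances sigma1^2 = 2 and
# sigma2^2 = 4, using the sums given in the problem.
n, m = 10, 20
xbar1, xbar2 = 96.08 / n, 237.09 / m
theta_hat = xbar1 - xbar2                 # point estimate

se = math.sqrt(2 / n + 4 / m)             # sqrt(0.4) ~ 0.6325
z = 1.96                                  # z_{0.025} for 95% coverage
ci = (theta_hat - z * se, theta_hat + z * se)
print(ci)  # the whole interval lies below 0
```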

SECTION B

B5

(i) Let’s show that:


\[
\sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} (X_i - \mu)^2 - n(\bar{X} - \mu)^2.
\]

Expanding the sum of squared deviations from the sample mean by writing \(X_i - \bar{X} = (X_i - \mu) + (\mu - \bar{X})\):

\[
\sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} \bigl((X_i - \mu) + (\mu - \bar{X})\bigr)^2.
\]

Expanding the square of each term:

\[
(X_i - \bar{X})^2 = (X_i - \mu)^2 + (\bar{X} - \mu)^2 - 2(X_i - \mu)(\bar{X} - \mu).
\]

Summing over i, the middle term contributes \(n(\bar{X} - \mu)^2\), and since \(\sum_{i=1}^{n}(X_i - \mu) = n(\bar{X} - \mu)\), the cross term contributes \(-2n(\bar{X} - \mu)^2\); combining them gives:

\[
\sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} (X_i - \mu)^2 - n(\bar{X} - \mu)^2.
\]

(ii) Let’s show that S 2 is an unbiased estimator of σ 2 :


The sample variance is defined as:

\[
S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2.
\]

We know from part (i) that:

\[
\sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} (X_i - \mu)^2 - n(\bar{X} - \mu)^2.
\]

Taking the expectation:

\[
E[S^2] = \frac{1}{n-1} E\!\left[\sum_{i=1}^{n} (X_i - \bar{X})^2\right] = \frac{1}{n-1}\left(\sum_{i=1}^{n} E[(X_i - \mu)^2] - nE[(\bar{X} - \mu)^2]\right).
\]

Since \(E[(X_i - \mu)^2] = \sigma^2\) and \(E[(\bar{X} - \mu)^2] = \frac{\sigma^2}{n}\), we get:

\[
E[S^2] = \frac{1}{n-1}\left(n\sigma^2 - n \cdot \frac{\sigma^2}{n}\right) = \frac{(n-1)\sigma^2}{n-1} = \sigma^2.
\]

Thus, S 2 is an unbiased estimator of σ 2 .

(iii) We are tasked with showing that the interval estimator:

\[
I(X) = \left( \frac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}}, \; \frac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}} \right)
\]

is a 100(1 − α)% confidence interval for σ².
To do so, we use the following steps:
Step 1: Chi-Squared Distribution of Sample Variance
The sample variance S² satisfies:

\[
\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1},
\]

i.e., \((n-1)S^2/\sigma^2\) follows a chi-squared distribution with n − 1 degrees of freedom.
Step 2: Confidence Interval Construction
To construct a confidence interval for σ 2 , we need to use the fact that the chi-squared
distribution is not symmetric, but its cumulative distribution function (CDF) gives us
the probability of the value falling within a particular range.
We want to find critical values corresponding to the desired confidence level:

- \(\chi^2_{n-1,\alpha/2}\), the critical value corresponding to the lower tail of the distribution,
- \(\chi^2_{n-1,1-\alpha/2}\), the critical value corresponding to the upper tail of the distribution.
Step 3: Deriving the Confidence Interval
Using the properties of the chi-squared distribution, we can derive the confidence interval
for σ 2 as follows:
\[
P\!\left(\chi^2_{n-1,\alpha/2} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{n-1,1-\alpha/2}\right) = 1 - \alpha.
\]

Rearranging the inequality to isolate σ², we obtain:

\[
\frac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}}.
\]

This gives the desired 100(1 − α)% confidence interval for σ².

(iv) Calculation of the 95% Confidence Interval for σ 2


Given:

• n = 10
• \(\sum_{i=1}^{n} x_i = 104.334\)
• \(\sum_{i=1}^{n} x_i^2 = 1132.207\)

Step 1: Calculate the Sample Mean X̄

\[
\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{104.334}{10} = 10.4334.
\]

Step 2: Calculate the Sample Variance S²
Using the formula for the sample variance:

\[
S^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right).
\]

Substitute the given values, with \((104.334)^2 = 10885.5836\), so \((104.334)^2/10 = 1088.5584\):

\[
S^2 = \frac{1}{9}\left(1132.207 - 1088.5584\right) = \frac{43.6486}{9} = 4.8499.
\]
Step 3: Find the Critical Values from the Chi-Squared Distribution Table
For a 95% confidence interval, the critical values are taken from the chi-squared distribution table with n − 1 = 9 degrees of freedom:

\[
\chi^2_{9,0.025} \approx 2.700, \qquad \chi^2_{9,0.975} \approx 16.919.
\]

Step 4: Calculate the Confidence Interval for σ²

The 95% confidence interval for σ² is given by:

\[
\left(\frac{(n-1)S^2}{\chi^2_{9,0.975}}, \; \frac{(n-1)S^2}{\chi^2_{9,0.025}}\right) = \left(\frac{9 \times 4.8499}{16.919}, \; \frac{9 \times 4.8499}{2.700}\right) = [2.580, 16.166].
\]

Thus, the 95% confidence interval for σ² is [2.580, 16.166].
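The steps above can be sketched numerically (the chi-squared critical values are hardcoded from the table, not computed):

```python
# Chi-squared confidence interval for sigma^2 from the given totals:
# n = 10, sum x_i = 104.334, sum x_i^2 = 1132.207.
n = 10
sum_x, sum_x2 = 104.334, 1132.207

s2 = (sum_x2 - sum_x**2 / n) / (n - 1)   # sample variance
chi_lo, chi_hi = 2.700, 16.919           # chi2_{9,0.025}, chi2_{9,0.975}
ci = ((n - 1) * s2 / chi_hi, (n - 1) * s2 / chi_lo)
print(s2, ci)
```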

B6

(i) Likelihood Function


We are given that X1 , X2 , . . . , Xn are independent random samples from a discrete dis-
tribution with the probability mass function (PMF):

\[
p_X(x) = (1-p)^x \, p, \quad \text{for } x = 0, 1, 2, \dots
\]


where p ∈ [0, 1] is the unknown parameter.
The likelihood function L(p) is the product of the PMF for each sample, given by:

\[
L(p) = \prod_{i=1}^{n} p_X(X_i).
\]

Substituting the given expression for \(p_X(x)\), we get:

\[
L(p) = \prod_{i=1}^{n} (1-p)^{X_i} p = p^n \prod_{i=1}^{n} (1-p)^{X_i} = p^n (1-p)^{\sum_{i=1}^{n} X_i}.
\]

Thus, the likelihood function is:

\[
L(p) = p^n (1-p)^{\sum_{i=1}^{n} X_i}.
\]

(ii) Maximum Likelihood Estimator


From Part (i), we obtained the likelihood function:

\[
L(p) = p^n (1-p)^{\sum_{i=1}^{n} X_i}.
\]

We want to maximize this likelihood with respect to p, via the log-likelihood function:

\[
\ell(p) = \log L(p) = n\log(p) + \left(\sum_{i=1}^{n} X_i\right)\log(1-p).
\]

Taking the derivative of ℓ(p) with respect to p and setting it to 0, we find:

\[
\frac{n}{p} = \frac{\sum_{i=1}^{n} X_i}{1-p}.
\]

Cross-multiplying and simplifying:

\[
n - np = p\sum_{i=1}^{n} X_i, \qquad n = p\left(n + \sum_{i=1}^{n} X_i\right),
\]

so:

\[
p = \frac{n}{n + \sum_{i=1}^{n} X_i}.
\]

Writing \(\sum_{i=1}^{n} X_i = n\bar{X}\) and substituting:

\[
p = \frac{n}{n + n\bar{X}} = \frac{1}{1 + \bar{X}}.
\]

Therefore, the maximum likelihood estimator for p is:

\[
\hat{p} = \frac{1}{1 + \bar{X}}.
\]
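A simulation sketch of this estimator (the parameter value p = 0.25 and the sample size are illustrative assumptions, not taken from the exam):

```python
import random

# The MLE p_hat = 1 / (1 + Xbar) for the PMF p_X(x) = (1 - p)^x * p,
# where x counts failures before the first success.
random.seed(0)
p_true = 0.25

def draw(p):
    """One draw: count failures until the first success."""
    x = 0
    while random.random() >= p:
        x += 1
    return x

sample = [draw(p_true) for _ in range(100_000)]
xbar = sum(sample) / len(sample)
p_hat = 1 / (1 + xbar)   # should be close to p_true
```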

(iii) Showing the Relation Between \(P(a < \hat{p} < b)\) and \(P\!\left(\frac{1-b}{b} < \bar{X} < \frac{1-a}{a}\right)\)

We want to show that:

\[
P(a < \hat{p} < b) = P\!\left(\frac{1-b}{b} < \bar{X} < \frac{1-a}{a}\right).
\]

From Part (ii), we know that the maximum likelihood estimator for p is \(\hat{p} = \frac{1}{1 + \bar{X}}\). We therefore compute:

\[
P(a < \hat{p} < b) = P\!\left(a < \frac{1}{1 + \bar{X}} < b\right).
\]

We will now solve this inequality for X̄ (assuming 0 < a < b ≤ 1, so all quantities are positive).

1. Invert the inequality (inverting positive quantities reverses the order):

\[
\frac{1}{b} < 1 + \bar{X} < \frac{1}{a}.
\]

2. Subtract 1 from each side:

\[
\frac{1}{b} - 1 < \bar{X} < \frac{1}{a} - 1.
\]

3. Simplify each side:

\[
\frac{1-b}{b} < \bar{X} < \frac{1-a}{a}.
\]

Thus, we have:

\[
P(a < \hat{p} < b) = P\!\left(\frac{1-b}{b} < \bar{X} < \frac{1-a}{a}\right).
\]

Since X̄ is the sample mean, this relation allows us to calculate probabilities about p̂ in terms of the sample mean X̄.

(iv) Calculation of the Approximate Probability


We are given n = 100 and p = 0.25, and we are asked to calculate the approximate probability that 0.22 < p̂ < 0.26.
For this distribution, the expectation and variance of a single observation X1 are:

\[
E(X_1) = \frac{1-p}{p} = \frac{1 - 0.25}{0.25} = 3, \qquad \operatorname{Var}(X_1) = \frac{1-p}{p^2} = \frac{1 - 0.25}{(0.25)^2} = 12.
\]

Thus, the expectation and variance of the sample mean X̄ are:

\[
E(\bar{X}) = 3, \qquad \operatorname{Var}(\bar{X}) = \frac{12}{100} = 0.12.
\]

The standard deviation of X̄ is:

\[
SD(\bar{X}) = \sqrt{0.12} \approx 0.3464.
\]

Now, using the result from part (iii), we can express the probability P(0.22 < p̂ < 0.26) as:

\[
P(0.22 < \hat{p} < 0.26) = P\!\left(\frac{1 - 0.26}{0.26} < \bar{X} < \frac{1 - 0.22}{0.22}\right) = P(2.8462 < \bar{X} < 3.5455).
\]

Now, we standardize the bounds using the Z-score formula \(Z = \frac{\bar{X} - E(\bar{X})}{SD(\bar{X})}\), with E(X̄) = 3 and SD(X̄) = 0.3464.

For X̄ = 2.8462:

\[
Z_{\text{lower}} = \frac{2.8462 - 3}{0.3464} \approx -0.444.
\]

For X̄ = 3.5455:

\[
Z_{\text{upper}} = \frac{3.5455 - 3}{0.3464} \approx 1.575.
\]
Using the standard normal distribution table, Φ(1.575) ≈ 0.9424 and Φ(−0.444) ≈ 0.3285. Thus, the probability is:

\[
P(-0.444 < Z < 1.575) = 0.9424 - 0.3285 = 0.6139.
\]

Therefore, the approximate probability that 0.22 < p̂ < 0.26 is about 0.614.
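The normal-approximation calculation above can be reproduced with `math.erf` for the standard normal CDF:

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return (1 + math.erf(z / math.sqrt(2))) / 2

n, p = 100, 0.25
mean_xbar = (1 - p) / p                        # E(Xbar) = 3
sd_xbar = math.sqrt((1 - p) / p**2 / n)        # sqrt(12/100) ~ 0.3464

lo, hi = (1 - 0.26) / 0.26, (1 - 0.22) / 0.22  # bounds on Xbar
prob = phi((hi - mean_xbar) / sd_xbar) - phi((lo - mean_xbar) / sd_xbar)
print(round(prob, 3))  # about 0.614
```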

B7

(i)
Consider a clinical study on a new treatment for Rhinovirus with n patients. Each
patient has a probability p of recovering, independently of other patients. Let Xi denote
the indicator variable for the i-th patient’s recovery:
\[
X_i =
\begin{cases}
1 & \text{if the } i\text{-th patient recovers}, \\
0 & \text{if the } i\text{-th patient does not recover.}
\end{cases}
\]

We are testing the following hypotheses:

H0 : p = p0 vs H1 : p > p0

We are interested in finding the expression for the sample proportion p̂ and its approxi-
mate distribution under the null hypothesis H0 .

Estimator of the Recovery Proportion

The sample proportion p̂ is simply the average of the Xi ’s, i.e., the proportion of patients
that recover in the sample. The mathematical expression for the sample proportion is:

\[
\hat{p} = \frac{1}{n}\sum_{i=1}^{n} X_i.
\]

This quantity p̂ represents the observed proportion of recovered patients in the sample.

Approximate Distribution of p̂ under H0

Under the null hypothesis H0 : p = p0 , each Xi follows a Bernoulli distribution with


parameter p0 , i.e.,

Xi ∼ Bernoulli(p0 )

The properties of the Bernoulli distribution give:

\[
E[X_i] = p_0 \quad \text{and} \quad \operatorname{Var}(X_i) = p_0(1 - p_0).
\]

The sample proportion p̂ is the mean of the Xi ’s:

\[
\hat{p} = \frac{1}{n}\sum_{i=1}^{n} X_i.
\]

By linearity of expectation, E[p̂] = p0; moreover, by the law of large numbers, p̂ converges to p0 as n increases.

Now, applying the Central Limit Theorem (CLT), which states that the mean of a large number of independent and identically distributed random variables is approximately normally distributed, we find that for large n the sample proportion p̂ approximately follows a normal distribution:
 
\[
\hat{p} \approx N\!\left(p_0, \frac{p_0(1 - p_0)}{n}\right).
\]

Thus, for large n, the distribution of p̂ is approximately normal with mean p0 and variance p0(1 − p0)/n.

Conclusion

In summary, the sample proportion of recovery is given by:

\[
\hat{p} = \frac{1}{n}\sum_{i=1}^{n} X_i.
\]

Under the null hypothesis H0 : p = p0 , for large n, the approximate distribution of p̂ is:
 
\[
\hat{p} \approx N\!\left(p_0, \frac{p_0(1 - p_0)}{n}\right).
\]

This result allows us to use the normal approximation for hypothesis testing and confi-
dence intervals regarding the recovery proportion p.

Define the following notation:

\[
Z_1 = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}} \quad \text{and} \quad Z_2 = \frac{\hat{p} - p_0}{\sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}}.
\]

The usual way of testing H0 vs H1 is to reject H0 if Z1 > zα, where zα is the upper α-quantile of the standard normal distribution N(0, 1). This gives a test with significance level α1, and for large n, α1 ≈ α.

Another way of testing H0 vs H1 is to reject H0 when Z2 > zα . This gives a test with
significance level α2 , and for large n, α2 ≈ α. Usually, α1 approximates α more closely
than α2 .
It is known that under the testing procedure based on Z2, H0 is rejected if and only if p̂ > γ, where:

\[
\gamma = \frac{-b + \sqrt{b^2 - 4ac}}{2a}, \quad \text{with} \quad a = 1 + \frac{z_\alpha^2}{n}, \quad b = -\left(2p_0 + \frac{z_\alpha^2}{n}\right), \quad c = p_0^2.
\]

(ii) Rejection Probability in Terms of γ


We want to find the probability of rejecting H0 based on the condition p̂ > γ, where γ is
determined by the quadratic equation.

Step 1: Initial Expression for the Probability

Under the hypothesis test, we reject H0 if p̂ > γ, where γ is defined by the quadratic formula. Using the normal approximation \(\hat{p} \approx N(p, p(1-p)/n)\) under the true proportion p, the probability of rejecting H0 is:
 
\[
P(\hat{p} > \gamma) = P\!\left(Z > \frac{\gamma - p}{\sqrt{\dfrac{p(1-p)}{n}}}\right),
\]

where Z ∼ N (0, 1) is a standard normal random variable.

Step 2: Simplification of the Expression

To move the square root of n to the numerator, we multiply and divide by √n inside the fraction:

\[
\frac{\gamma - p}{\sqrt{\dfrac{p(1-p)}{n}}} = \frac{\gamma - p}{\sqrt{p(1-p)}}\cdot\sqrt{n}.
\]

Step 3: Final Expression for the Probability

Thus, the probability of rejecting H0 becomes:

\[
P(\hat{p} > \gamma) = P\!\left(Z > \frac{\gamma - p}{\sqrt{p(1-p)}}\cdot\sqrt{n}\right).
\]

Using the cumulative distribution function Φ of the standard normal distribution, we can write the probability as:

\[
P(\hat{p} > \gamma) = 1 - \Phi\!\left(\frac{\gamma - p}{\sqrt{p(1-p)}}\cdot\sqrt{n}\right).
\]

Conclusion

The probability of rejecting H0 under the alternative hypothesis p > p0 is given by:

\[
P(\hat{p} > \gamma) = 1 - \Phi\!\left(\frac{\sqrt{n}(\gamma - p)}{\sqrt{p(1-p)}}\right).
\]

This is the desired form, with √n in the numerator.

(iii) Rejection Probability Based on Z2


We want to compute the approximate probability of rejecting H0 when using the proce-
dure based on Z2 with the given parameters.

Given Parameters

- Significance level α = 0.05,
- Sample size n = 200,
- Hypothesized proportion p0 = 0.3,
- True proportion p = 0.35,
- zα = 1.6449 (the upper 0.05 quantile of the standard normal distribution).

We use the following expression for the probability of rejecting H0 based on Z2:

\[
P(\hat{p} > \gamma) = 1 - \Phi\!\left(\frac{\gamma - p}{\sqrt{p(1-p)}}\cdot\sqrt{n}\right),
\]

where γ is the threshold value determined by the quadratic formula.

Step 1: Compute γ

The threshold value γ is given by the solution of the quadratic equation:

\[
\gamma = \frac{-b + \sqrt{b^2 - 4ac}}{2a},
\]

where \(a = 1 + \frac{z_\alpha^2}{n}\), \(b = -\left(2p_0 + \frac{z_\alpha^2}{n}\right)\), and \(c = p_0^2\). Substituting zα = 1.6449, p0 = 0.3, and n = 200 (note that zα² = 2.7057):

\[
a = 1 + \frac{2.7057}{200} = 1.01353, \qquad b = -\left(0.6 + 0.01353\right) = -0.61353, \qquad c = (0.3)^2 = 0.09.
\]

Now, substitute into the quadratic formula:

\[
\gamma = \frac{0.61353 + \sqrt{(-0.61353)^2 - 4(1.01353)(0.09)}}{2(1.01353)}.
\]

First, calculate the discriminant:

\[
(-0.61353)^2 = 0.37642, \qquad 4(1.01353)(0.09) = 0.36487, \qquad \text{discriminant} = 0.37642 - 0.36487 = 0.01155.
\]

Thus, with \(\sqrt{0.01155} \approx 0.10747\):

\[
\gamma = \frac{0.61353 + 0.10747}{2.02706} = \frac{0.72100}{2.02706} \approx 0.3557.
\]

So, γ ≈ 0.3557.
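The threshold computation can be sketched directly:

```python
import math

# Solving the quadratic for gamma with z_alpha = 1.6449 (alpha = 0.05),
# p0 = 0.3, n = 200, as in the text above.
z_a, p0, n = 1.6449, 0.3, 200

a = 1 + z_a**2 / n
b = -(2 * p0 + z_a**2 / n)
c = p0**2
gamma = (-b + math.sqrt(b**2 - 4 * a * c)) / (2 * a)
print(round(gamma, 4))  # about 0.3557
```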

Step 2: Compute the Probability of Rejecting H0

Now that we have γ ≈ 0.3557, we compute the probability of rejecting H0 when the true
proportion is p = 0.35.
The probability of rejecting H0 is:

\[
P(\hat{p} > \gamma) = 1 - \Phi\!\left(\frac{\gamma - p}{\sqrt{p(1-p)}}\cdot\sqrt{n}\right).
\]

Substituting the known values, first calculate the denominator:

\[
\sqrt{p(1-p)} = \sqrt{0.35 \times 0.65} = \sqrt{0.2275} \approx 0.4770.
\]

Now calculate the argument of Φ:

\[
\frac{0.3557 - 0.35}{0.4770}\cdot\sqrt{200} = \frac{0.0057}{0.4770}\times 14.142 \approx 0.169.
\]
0.4769 0.4769
Thus, the probability is:

\[
P(\hat{p} > \gamma) = 1 - \Phi(0.169).
\]

From the standard normal distribution, Φ(0.169) ≈ 0.5671, so:

\[
P(\hat{p} > \gamma) = 1 - 0.5671 = 0.4329.
\]

Conclusion

The approximate probability of rejecting H0 using the procedure based on Z2 is about 0.433: there is roughly a 43.3% chance of rejecting H0 when the true proportion is p = 0.35, p0 = 0.3, and the sample size is n = 200, with a significance level of α = 0.05.
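Putting the pieces together, the power calculation reads as follows (γ recomputed from the quadratic; Φ implemented with `math.erf`):

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return (1 + math.erf(z / math.sqrt(2))) / 2

# Power of the Z2-based test at p = 0.35, with the threshold gamma
# from the quadratic (z_alpha = 1.6449, p0 = 0.3, n = 200).
z_a, p0, n, p = 1.6449, 0.3, 200, 0.35
a = 1 + z_a**2 / n
b = -(2 * p0 + z_a**2 / n)
gamma = (-b + math.sqrt(b**2 - 4 * a * p0**2)) / (2 * a)

power = 1 - phi(math.sqrt(n) * (gamma - p) / math.sqrt(p * (1 - p)))
print(round(power, 3))  # about 0.433
```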

(iv) Rejection Condition for p


We are given that the test based on Z2 rejects the null hypothesis H0 if and only if Z2 > zα ,
where zα is the critical value of the standard normal distribution corresponding to the
significance level α. The goal of this proof is to show that this condition is equivalent to
rejecting H0 if and only if p̂ > γ, where γ is a specific threshold that depends on p0 , n,
and zα .

Step 1: Expression for Z2

Recall the expression for the test statistic Z2:

\[
Z_2 = \frac{\hat{p} - p_0}{\sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}},
\]

where:

- p̂ is the sample proportion,
- p0 is the hypothesized proportion under the null hypothesis,
- n is the sample size.

Under the testing procedure, we reject H0 if and only if Z2 > zα. Since zα > 0, rejection requires Z2 > 0 (i.e., p̂ > p0), and on that side the condition can be rewritten as:

\[
Z_2^2 > z_\alpha^2.
\]

Step 2: Setting Up the Rejection Criterion

To find the equivalent condition in terms of p̂, we start with the expression for Z2²:

\[
Z_2^2 = \left(\frac{\hat{p} - p_0}{\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}}\right)^{\!2} = \frac{(\hat{p} - p_0)^2}{\dfrac{\hat{p}(1-\hat{p})}{n}}.
\]

Rewriting the rejection condition Z2² > zα², we get:

\[
\frac{(\hat{p} - p_0)^2}{\dfrac{\hat{p}(1-\hat{p})}{n}} > z_\alpha^2.
\]

Multiplying both sides by \(\frac{\hat{p}(1-\hat{p})}{n}\), we obtain:

\[
(\hat{p} - p_0)^2 > z_\alpha^2 \cdot \frac{\hat{p}(1-\hat{p})}{n}.
\]

Step 3: Solving the Equation

Next, we solve for p̂ by considering the equality case, which defines the critical threshold γ:

\[
(\hat{p} - p_0)^2 = z_\alpha^2 \cdot \frac{\hat{p}(1-\hat{p})}{n}.
\]

Expanding both sides:

\[
\hat{p}^2 - 2p_0\hat{p} + p_0^2 = \frac{z_\alpha^2}{n}\left(\hat{p} - \hat{p}^2\right).
\]

Collecting all terms on one side and grouping by powers of p̂:

\[
\left(1 + \frac{z_\alpha^2}{n}\right)\hat{p}^2 - \left(2p_0 + \frac{z_\alpha^2}{n}\right)\hat{p} + p_0^2 = 0.
\]

This is a quadratic equation in p̂, which we solve using the quadratic formula:

\[
\hat{p} = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a},
\]

where \(a = 1 + \frac{z_\alpha^2}{n}\), \(b = -\left(2p_0 + \frac{z_\alpha^2}{n}\right)\), and \(c = p_0^2\). Substituting these values, we obtain:

\[
\hat{p} = \frac{\left(2p_0 + \dfrac{z_\alpha^2}{n}\right) \pm \sqrt{\left(2p_0 + \dfrac{z_\alpha^2}{n}\right)^2 - 4\left(1 + \dfrac{z_\alpha^2}{n}\right)p_0^2}}{2\left(1 + \dfrac{z_\alpha^2}{n}\right)}.
\]

The larger of the two roots corresponds to the critical value γ, so we define:

\[
\gamma = \frac{-b + \sqrt{b^2 - 4ac}}{2a}.
\]

Thus, the rejection criterion is: H0 is rejected if and only if p̂ > γ.

Conclusion

We have shown that under the testing procedure based on Z2, H0 is rejected if and only if p̂ > γ, where γ is given by the quadratic formula:

\[
\gamma = \frac{-b + \sqrt{b^2 - 4ac}}{2a}, \quad \text{with} \quad a = 1 + \frac{z_\alpha^2}{n}, \quad b = -\left(2p_0 + \frac{z_\alpha^2}{n}\right), \quad c = p_0^2.
\]
This shows that the test based on Z2 is equivalent to rejecting H0 when p̂ > γ, as required.
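The equivalence can also be verified numerically on a grid of p̂ values (the parameter values below are illustrative):

```python
import math

# Check that the rule Z2 > z_alpha agrees with p_hat > gamma for
# illustrative values p0 = 0.3, n = 200, z_alpha = 1.6449.
z_a, p0, n = 1.6449, 0.3, 200

a = 1 + z_a**2 / n
b = -(2 * p0 + z_a**2 / n)
gamma = (-b + math.sqrt(b**2 - 4 * a * p0**2)) / (2 * a)

def z2(p_hat):
    return (p_hat - p0) / math.sqrt(p_hat * (1 - p_hat) / n)

# Both rejection rules should make the same decision everywhere.
agree = all((z2(k / 1000) > z_a) == (k / 1000 > gamma)
            for k in range(50, 951))
assert agree
```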
