4.1. Estimation


Estimation

Biostatistics Course 2021-2022


Block 4 – Session 4.1

Ali Lateef Jasim


MBChB.
Learning objectives

❑ Define statistical inference.
❑ Explain the concept of estimation.
❑ List and define the types of estimation: point and
interval estimation.
❑ Define the confidence interval and appraise its
importance.
❑ Practice some examples.
Statistical Inference

❖ It is the procedure where inference about a population is
made on the basis of the results obtained from a sample
drawn from that population.
❖ This can be achieved by :
A. Hypothesis testing
B. Estimation:
✓ Point estimation
✓ Interval estimation
Estimation
❖ If the mean and the variance of a normal distribution
are known, then the probabilities of various events can
be determined.
❖ But these values are almost never known, and we
have to estimate them from the information in a
simple random sample.
❖ The process of estimation involves calculating from the
data of a sample, some “statistic” which is an
approximation of the corresponding “parameter” of the
population from which the sample was drawn.
Point Estimation
❖ It is a single numerical value obtained from a random
sample used to estimate the corresponding population
parameter.
❖ Sample mean (X̄) is the best point estimate for
population mean (µ).
❖ Sample standard deviation (S) is the best point estimate
for population standard deviation (σ).
❖ Sample proportion (p) is the best point estimate for
population proportion (P).
Point Estimation

❖ However, there is always some sampling error, which can be
measured by the standard error of the mean and which
reflects the precision of the estimated mean.
❖ Because of sampling variation we cannot say that the
exact parameter value is some specific number.
❖ But we can determine a range of values within which
we are confident the unknown parameter lies.
Interval Estimation

❖ It consists of two numerical values defining an interval
within which the unknown parameter we want to estimate
lies, with a specified degree of confidence.
❖ The values depend on the confidence level, which is
equal to 1 − α (α is the probability of error).
❖ The interval estimate may be expressed as:

Estimator ± Reliability coefficient (Z) × Standard error
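
The general form above can be sketched in a few lines of Python. This is only an illustration: the function name `interval_estimate` and the numbers in the usage lines are hypothetical and do not come from the slides.

```python
def interval_estimate(estimator, z, standard_error):
    """Return (lower, upper) for: estimator ± z * standard_error."""
    margin = z * standard_error  # half-width of the interval
    return estimator - margin, estimator + margin


# Hypothetical values, chosen only to show the call.
lower, upper = interval_estimate(estimator=10.0, z=1.96, standard_error=0.5)
print(round(lower, 2), round(upper, 2))
```

The same helper shape applies to all four cases in this session; only the estimator and the standard error formula change.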


Equations for SE

A. Population mean
   Parameter: µ
   Estimator: sample mean (X̄)
   Standard error: SE = σ/√n

B. Difference between two population means
   Parameter: µ1 − µ2
   Estimator: difference between two sample means (X̄1 − X̄2)
   Standard error: SE = √(σ1²/n1 + σ2²/n2)

C. Population proportion
   Parameter: P
   Estimator: sample proportion (p)
   Standard error: SE = √(p(1 − p)/n)

D. Difference between two population proportions
   Parameter: P1 − P2
   Estimator: difference between two sample proportions (p1 − p2)
   Standard error: SE = √(p1(1 − p1)/n1 + p2(1 − p2)/n2)
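
As a rough sketch, the four standard error equations can be written as small Python functions. The function names are assumptions made for this illustration, and the values in the usage lines are hypothetical.

```python
from math import sqrt


def se_mean(sigma, n):
    """A. Single mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def se_diff_means(sigma1, n1, sigma2, n2):
    """B. Difference between two means."""
    return sqrt(sigma1**2 / n1 + sigma2**2 / n2)


def se_proportion(p, n):
    """C. Single proportion."""
    return sqrt(p * (1 - p) / n)


def se_diff_proportions(p1, n1, p2, n2):
    """D. Difference between two proportions."""
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


# Hypothetical inputs, only to show how the functions are called.
print(se_mean(2.0, 25))
print(se_proportion(0.5, 100))
```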
Reliability Coefficient
❖ It is the value Z(1−α/2) of the standard normal
distribution corresponding to the chosen confidence level.
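
One way to look up the reliability coefficient without a Z table is the standard library's `statistics.NormalDist`. The sketch below assumes a two-sided interval; the helper name `reliability_coefficient` is hypothetical.

```python
from statistics import NormalDist


def reliability_coefficient(confidence_level):
    """Z value with area 1 - alpha/2 to its left under the standard normal curve."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    # Roughly 1.645, 1.960 and 2.576 respectively.
    print(level, round(reliability_coefficient(level), 3))
```

The 99% value is about 2.576, which is commonly rounded to 2.58 in tables.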
Confidence Interval
❖ The confidence interval is central and symmetric around
the point estimate (e.g. the sample mean), so that there is
an α/2 chance that the parameter is above the upper limit
and an α/2 chance that it is below the lower limit.
❖ The width of the interval estimation is increased by:
✓ Increasing confidence level (i.e.: decreasing alpha
value).
✓ Decreasing sample size.
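
The two effects on width can be checked numerically. The sketch below uses the single-mean case with a known σ; the function name `ci_width` and the values σ = 10, n = 25 and n = 100 are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(sigma, n, confidence_level):
    """Full width of a Z-based CI for a mean: 2 * Z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return 2 * z * sigma / sqrt(n)


# Same sample size, rising confidence level: the interval gets wider.
for level in (0.90, 0.95, 0.99):
    print("n=25 ", level, round(ci_width(10, 25, level), 2))

# Same confidence level, larger sample: the interval gets narrower.
print("n=100", 0.95, round(ci_width(10, 100, 0.95), 2))
```

Because the width depends on 1/√n, quadrupling the sample size halves the width at a fixed confidence level.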
Confidence Interval
The confidence interval can shed light on the following
information:
1. The range within which the true value of the estimated
parameter is expected to lie (at the stated confidence level).
2. The statistical significance of a difference (in population
means or proportions).
If the ZERO value is included in the interval of such differences
(i.e.: the range lies between a negative value and a positive
value), then we can state that there is no statistically significant
difference between the two population values (parameters),
although the sample values (statistics) showed a difference.
Confidence Interval
The confidence interval can shed light on the following
information:
3. The sample size.
✓ A narrow interval indicates a “large” sample size.
✓ While a wide interval indicates a “small” sample size
(with fixed confidence level).
Single Mean
Exercise 1.
The mean serum indirect bilirubin level of 16 four-day-old
infants was found to be 5.98 mg/dl. The population SD (σ) =
3.5 mg/dl. Assuming normality, find the 90%, 95%, and 99% CI for µ.
Answer:
The interval estimate = Estimator (statistic) ± Reliability
coefficient(Z) * standard error (SE)
1. Sample mean (estimator) = 5.98 mg/dl
Population SD = 3.5 mg/dl
Standard error (for a single mean)(SE) = σ /√ n
2. Standard error = 3.5 /√ 16 = 3.5 / 4 = 0.875
Exercise 1.
3. Reliability coefficient (Z) = according to the level of
confidence :
For 90% CI; Z= 1.645
For 95% CI; Z= 1.96
For 99% CI; Z= 2.58

A. For 90% CI for µ = X̄ ± [Z * SE] = 5.98 ± (1.645 * 0.875)
= 5.98 ± 1.44
So, Confidence interval is (4.54 – 7.42).
Exercise 1.
B. For 95% CI for µ = X̄ ± [Z * SE] = 5.98 ± (1.96 * 0.875)
= 5.98 ± 1.715
So, Confidence interval is (4.265 – 7.695).

C. For 99% CI for µ = X̄ ± [Z * SE] = 5.98 ± (2.58 * 0.875)
= 5.98 ± 2.26
So, Confidence interval is (3.72 – 8.24).
Exercise 1.
What happened to the CI on increasing the confidence
level?
✓ 90% CI = (4.54 – 7.42).
✓ 95% CI = (4.265 – 7.695).
✓ 99% CI = (3.72 – 8.24).

We can notice that as the confidence level is increased
(lowering the alpha level of the estimation) the confidence
interval width increases (more values are included).
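
The three intervals of Exercise 1 can be recomputed with a short script. It uses the data from the exercise and the same rounded Z values as the slides (1.645, 1.96, 2.58); the variable names are chosen for this sketch only.

```python
from math import sqrt

mean, sigma, n = 5.98, 3.5, 16          # data from Exercise 1
se = sigma / sqrt(n)                     # 3.5 / 4 = 0.875
z_values = {90: 1.645, 95: 1.96, 99: 2.58}

intervals = {}
for level, z in z_values.items():
    margin = z * se
    intervals[level] = (mean - margin, mean + margin)
    print(f"{level}% CI: {intervals[level][0]:.3f} to {intervals[level][1]:.3f}")
```

The limits agree with the hand calculation up to rounding, and the widths grow as the confidence level rises.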
Difference between
two means
Exercise 2.
A sample of 10 twelve-year-old boys and a sample of 10
twelve-year-old girls yielded mean heights of 59.8 inches
(boys) and 58.5 inches (girls). Assuming normality, with σ1 = 2
inches and σ2 = 3 inches, find the 90% CI for the difference in
mean height between boys and girls at this age.
Answer:
We can see that we have 2 samples, 2 means and 2
standard deviations, so we have to use the equation for
difference between two means.
The interval estimate = Estimator (statistic) ± Reliability
coefficient (Z) * standard error (SE)
Exercise 2.
1. (X̄boys – X̄girls) (estimator) = 59.8 − 58.5 = 1.3
2. Standard error (SE of X̄boys – X̄girls)
= √(σ1²/n1 + σ2²/n2)
= √(2²/10 + 3²/10)
= √(4/10 + 9/10) = √(0.4 + 0.9) = √1.3 = 1.14
3. Reliability coefficient (Z) for 90% CI = 1.645

90% CI for µboys − µgirls = 1.3 ± (1.645 * 1.14)
= 1.3 ± 1.875 = (−0.575 to 3.175)
Exercise 2.

Since ZERO is included in the interval, there is
no statistically significant difference between
the two population means.
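
Exercise 2 and the "is zero inside the interval?" check can be reproduced as follows. The data and the Z value are those of the exercise; the variable names are assumptions of this sketch.

```python
from math import sqrt

mean_boys, mean_girls = 59.8, 58.5       # sample means (inches)
sigma_1, sigma_2 = 2.0, 3.0              # population SDs as given in the exercise
n_1 = n_2 = 10
z_90 = 1.645                             # reliability coefficient for 90% CI

diff = mean_boys - mean_girls
se = sqrt(sigma_1**2 / n_1 + sigma_2**2 / n_2)
margin = z_90 * se
lower, upper = diff - margin, diff + margin

# Zero inside the interval means no statistically significant difference.
contains_zero = lower <= 0 <= upper
print(f"{lower:.3f} to {upper:.3f}, contains zero: {contains_zero}")
```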
Single Proportion
Exercise 3.
In a survey, 300 adults were interviewed; 123 said they had a
yearly medical checkup. Find the 95% CI for the true
proportion of adults having a yearly medical checkup.
Answer:
We have a sample (300) and a proportion of them (123)
did the check ups, so we have to use the equation of single
proportion.
The interval estimate = Estimator (statistic) ± Reliability
coefficient (Z) * standard error (SE)
Exercise 3.
1. Sample proportion (the estimator) = p = 123/300 = 0.41
2. Standard error (SE) for p = √(p(1 − p)/n)
= √(0.41 * (1 − 0.41)/300)
= √(0.41 * 0.59/300) = √(0.242/300)
= √0.0008 = 0.028
3. Reliability coefficient (Z) for 95% CI = 1.96
95% CI for P = p ± (Z * SE) = 0.41 ± (1.96 * 0.028)
= 0.41 ± 0.055 = (0.355 – 0.465)
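
A minimal script for Exercise 3, using the survey counts from the exercise; the variable names are chosen for illustration. Because the script does not round the intermediate steps, its limits can differ from the hand calculation in the third decimal place.

```python
from math import sqrt

n = 300            # adults interviewed
successes = 123    # adults reporting a yearly checkup
z_95 = 1.96        # reliability coefficient for 95% CI

p_hat = successes / n
se = sqrt(p_hat * (1 - p_hat) / n)
margin = z_95 * se
lower, upper = p_hat - margin, p_hat + margin
print(f"p = {p_hat:.2f}, SE = {se:.3f}, 95% CI: {lower:.3f} to {upper:.3f}")
```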
Difference between
two proportions
Exercise 4.
200 patients suffering from a certain disease were
randomly divided into two equal groups. The first group
received NEW treatment, 90 recovered in three days. Out
of the other 100 who received the STANDARD treatment
78 recovered within three days. Find the 95% CI for the
difference between the proportion of recovery among the
populations receiving the two treatments.
Answer:
We have 2 groups (samples) and we have the proportions
of who recovered from these 2 groups, so we need to use
the equation of difference between 2 proportions.
Exercise 4.
The interval estimate = Estimator (statistic) ± Reliability
coefficient (Z) * standard error (SE)
1. The estimator = the difference between the 2 sample proportions
= p1 − p2 = 90/100 − 78/100 = 0.9 − 0.78 = 0.12
2. Standard error (SE) for p1 − p2
= √(p1(1 − p1)/n1 + p2(1 − p2)/n2)
= √(0.9 * (1 − 0.9)/100 + 0.78 * (1 − 0.78)/100)
= √(0.0009 + 0.001716) = √0.002616 ≈ 0.05
Exercise 4.
3. Reliability coefficient (Z) for 95% CI = 1.96
95% CI = 0.12 ± (1.96 * 0.05) ≈ 0.12 ± 0.1 = (0.02 – 0.22)

Since ZERO is not included in the interval, there is a
statistically significant difference between the
two population proportions.
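
Exercise 4 can be checked the same way. The counts come from the exercise; the variable names are assumptions of this sketch, and the unrounded limits may differ slightly from the hand-rounded (0.02 – 0.22).

```python
from math import sqrt

n_new, recovered_new = 100, 90     # NEW treatment group
n_std, recovered_std = 100, 78     # STANDARD treatment group
z_95 = 1.96                        # reliability coefficient for 95% CI

p1 = recovered_new / n_new
p2 = recovered_std / n_std
diff = p1 - p2
se = sqrt(p1 * (1 - p1) / n_new + p2 * (1 - p2) / n_std)
margin = z_95 * se
lower, upper = diff - margin, diff + margin

# Zero outside the interval means a statistically significant difference.
excludes_zero = not (lower <= 0 <= upper)
print(f"{lower:.3f} to {upper:.3f}, excludes zero: {excludes_zero}")
```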
Thank You
Ali Lateef Jasim
MBChB.
