Stat ch2

This document discusses statistical methods for estimating population parameters from sample data, specifically the population mean. It covers point estimates, interval estimates, and confidence intervals. When sample sizes are large, the normal distribution can be used to construct confidence intervals for the population mean using the z-statistic. For small sample sizes with an unknown population standard deviation, the t-distribution must be used instead of the normal. The t-distribution approaches the normal distribution as sample size increases.

Uploaded by

Solomon Melese
Copyright
© All Rights Reserved

Statistics for management

Chapter two
Statistical Inferences; Estimating for single populations
Estimating Population Mean with Large sample size

On many occasions estimating the population mean is useful in business research. A point
estimate is a statistic taken from a sample and used to estimate a population parameter.
However, a point estimate is only as good as the representativeness of its sample. If other
random samples are taken from the population, the point estimates derived from those
samples are likely to vary. Because of this variation in sample statistics, estimating a
population parameter with an interval estimate is often preferable to using a point estimate.
An interval estimate (confidence interval) is a range of values within which the analyst can
declare with some confidence that the population parameter lies. Confidence intervals can be
two sided or one sided.

As a result of the central limit theorem, the following Z formula for means can be used
when sample sizes are large, regardless of the shape of the population distribution, or for
smaller sizes if the population is normally distributed.

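\[ Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

Rearranging this formula algebraically to solve for \(\mu\) gives

\[ \mu = \bar{x} - Z \frac{\sigma}{\sqrt{n}} \]
Because a sample mean can be greater than or less than the population mean, Z can be
positive or negative. Thus;

\[ \mu = \bar{x} \pm Z \frac{\sigma}{\sqrt{n}} \]
Rewriting this expression yields the confidence interval formula for estimating \(\mu\) with
large sample size.

\(100(1 - \alpha)\%\) confidence interval to estimate \(\mu\)

\[ \bar{x} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \] ……………………………………………….. (2.1)
Or

\[ \bar{x} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]
Where
\(\alpha\) = The area under the normal curve outside the confidence interval area
\(\alpha/2\) = The area in one end (tail) of the distribution outside the confidence interval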

Here we use \(\alpha\) to locate the \(Z_{\alpha/2}\) value in constructing the confidence
interval. Because the standard normal table is based on areas between a Z of 0 and
\(Z_{\alpha/2}\), the table Z value is found by locating the area of \(0.5 - \alpha/2\), which is
the part of the normal curve between the middle of the curve and one of the tails. Another
way to locate this Z value is to change the confidence level from a percentage to a
proportion, divide it in half, and go to the table with this value. The results are the same.

The confidence interval formula (2.1) yields a range (interval) within which we feel with
some confidence the population mean is located. It is not certain that the population mean
is in the interval unless we have a 100% confidence interval, which is infinitely wide.
However, we can assign a probability that the parameter (\(\mu\)) is located within the
interval. Formula 2.1 can be presented as a probability statement

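\[ P\left( \bar{x} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \right) = 1 - \alpha \]

[Figure: Z scores for confidence intervals in relation to \(\alpha\). The horizontal axis is marked at \(-Z_{\alpha/2}\), 0 and \(Z_{\alpha/2}\); the central area \(1 - \alpha\) is the confidence, and each shaded tail has area \(\alpha/2\).]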
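
The probability statement can be checked informally by simulation. The sketch below is only an illustration: the population values (mean 50, standard deviation 10), the sample size of 36, the number of trials and the seed are all hypothetical choices, not figures from this chapter. It draws many samples from a normal population, builds the interval of formula 2.1 around each sample mean with Z = 1.96, and counts how often the interval contains the true mean.

```python
import random
from math import sqrt


def coverage_rate(mu, sigma, n, z, trials, seed=0):
    """Fraction of simulated samples whose z interval contains mu."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    margin = z * sigma / sqrt(n)       # Z * sigma / sqrt(n), as in formula 2.1
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if sample_mean - margin <= mu <= sample_mean + margin:
            hits += 1
    return hits / trials


# Hypothetical population and sample size, chosen only for illustration.
rate = coverage_rate(mu=50.0, sigma=10.0, n=36, z=1.96, trials=2000)
print(f"share of intervals containing mu: {rate:.3f}")
```

With Z = 1.96 the share should come out close to 0.95, although any single run will differ slightly because of sampling variation.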

Example;
A real estate broker estimates the mean family income in the area as an indicator of expected
sales. A sample of 100 families yields a mean of \(\bar{x}\) = 35,500. Presume the population
standard deviation is \(\sigma\) = 7,200. A 95% confidence interval for \(\mu\) is estimated as

\[ \bar{x} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]

The Z value for a 95% confidence interval is 1.96 (0.95/2 = 0.475, for which the table Z value
is 1.96; equivalently, \(\alpha/2\) = 0.025 and 0.5 – 0.025 = 0.475, for which the Z value is 1.96)

\[ 35{,}500 \pm (1.96) \frac{7{,}200}{\sqrt{100}} = 35{,}500 \pm 1{,}411.20 \]
\[ 34{,}088.80 \le \mu \le 36{,}911.20 \]

Interpretation
- The broker is 95% confident that the true unknown population mean is between
34,088.80 and 36,911.20 dollars.
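
The arithmetic in this example can be reproduced with a few lines of Python. This is a minimal sketch; the function name `z_interval` is made up for illustration, and the Z value of 1.96 is entered by hand rather than looked up.

```python
from math import sqrt


def z_interval(sample_mean, sigma, n, z):
    """Two-sided interval: sample_mean +/- z * sigma / sqrt(n)."""
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Figures from the real estate example: mean 35,500, sigma 7,200, n = 100.
low, high = z_interval(35_500, 7_200, 100, 1.96)
print(f"{low:,.2f} <= mu <= {high:,.2f}")
```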

Finite correction Factor


In case of interval estimation, the finite correction factor is used to reduce the width of the
interval.

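Confidence interval to estimate \(\mu\) using the finite correction factor

\[ \bar{x} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} \le \mu \le \bar{x} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} \] …………………….2.2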
Example
A random sample of 50 from 800 engineers reveals that the average sample age is 34.3
years. Historically, the population standard deviation of the ages of the company's
engineers is approximately 8 years. Construct a 98% confidence interval to estimate the
average age of all the engineers in this company.
Solution
N = 800
n = 50, which is 6.25% of N (finite correction factor used)
\(\bar{x}\) = 34.3
\(\sigma\) = 8

Z value for a 98% confidence interval is 2.33

Using the Z formula with the finite correction factor yields

\[ 34.3 - 2.33 \frac{8}{\sqrt{50}} \sqrt{\frac{750}{799}} \le \mu \le 34.3 + 2.33 \frac{8}{\sqrt{50}} \sqrt{\frac{750}{799}} \]
\[ 34.3 - 2.55 \le \mu \le 34.3 + 2.55 \]
\[ 31.75 \le \mu \le 36.85 \]
Without the finite correction factor, the result would have been

\[ 34.3 - 2.64 \le \mu \le 34.3 + 2.64 \]
\[ 31.66 \le \mu \le 36.94 \]
The finite correction factor takes into account the fact that the population is only 800
instead of being infinitely large. The sample, n = 50, is a greater proportion of the 800 than
it would be of a larger population, and thus the width of the confidence interval is reduced.
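
The effect of the finite correction factor on the engineer example can be checked numerically. The following is a sketch under the usual form of the factor, sqrt((N - n)/(N - 1)); the helper name `z_interval_finite` is hypothetical. With the factor applied the interval comes out near 31.75 to 36.85, and without it the interval is the wider 31.66 to 36.94.

```python
from math import sqrt


def z_interval_finite(sample_mean, sigma, n, population_size, z):
    """z interval with the finite correction factor sqrt((N - n) / (N - 1))."""
    fpc = sqrt((population_size - n) / (population_size - 1))
    margin = z * (sigma / sqrt(n)) * fpc
    return sample_mean - margin, sample_mean + margin, fpc


# Engineer example: N = 800, n = 50, mean 34.3, sigma 8, Z = 2.33.
low, high, fpc = z_interval_finite(34.3, 8, 50, 800, 2.33)
uncorrected_margin = 2.33 * 8 / sqrt(50)

print(f"correction factor: {fpc:.4f}")
print(f"with correction:    {low:.2f} <= mu <= {high:.2f}")
print(f"without correction: {34.3 - uncorrected_margin:.2f} <= mu <= "
      f"{34.3 + uncorrected_margin:.2f}")
```

Because the factor is always below 1 when n is greater than 1, applying it can only narrow the interval.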

Confidence interval to estimate \(\mu\) when \(\sigma\) is unknown and n is large


When sample sizes are large (n ≥ 30), the sample standard deviation is a good estimate of
the population standard deviation and can be used as an acceptable approximation of the
population standard deviation in the Z formula for a mean. Because formulas based on the
central limit theorem require large samples for non-normal populations, it makes sense to
modify formula (2.1) to use the sample standard deviation, S. Beware, however, not to use
this modified formula for small samples when the population standard deviation is
unknown, even when the population is normally distributed.

\[ \bar{x} \pm Z_{\alpha/2} \frac{S}{\sqrt{n}} \]
Or

\[ \bar{x} - Z_{\alpha/2} \frac{S}{\sqrt{n}} \le \mu \le \bar{x} + Z_{\alpha/2} \frac{S}{\sqrt{n}} \] ……………………………… (2.3)
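Example
Given n = 110, \(\bar{x}\) = 85.5 and S = 19.3, compute a 99% confidence interval to estimate \(\mu\).

Solution
The confidence interval is
\[ \bar{x} - Z_{\alpha/2} \frac{S}{\sqrt{n}} \le \mu \le \bar{x} + Z_{\alpha/2} \frac{S}{\sqrt{n}} \]
with \(Z_{\alpha/2}\) = 2.575 (0.99/2 = 0.495; equivalently, \(\alpha/2\) = 0.005 and
0.5 – 0.005 = 0.495, for which the table Z value is 2.575).

\[ 85.5 - 2.575 \frac{19.3}{\sqrt{110}} \le \mu \le 85.5 + 2.575 \frac{19.3}{\sqrt{110}} \]
\[ 80.8 \le \mu \le 90.2 \]
With 99% confidence, we estimate that the population mean is somewhere between 80.8
and 90.2.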

Table
- Values of Z for some of the more common levels of confidence.

Confidence level    Z value

90%                 1.645
95%                 1.96
98%                 2.33
99%                 2.575
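
The table values can be recomputed instead of read from a printed table. This sketch uses `NormalDist.inv_cdf` from Python's standard `statistics` module; the helper name `z_table` is made up for illustration. Each value is the point with cumulative area 1 - alpha/2 under the standard normal curve.

```python
from statistics import NormalDist


def z_table(levels):
    """Map each confidence level (as a proportion) to its two-sided Z value."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return {level: std_normal.inv_cdf(1 - (1 - level) / 2) for level in levels}


table = z_table([0.90, 0.95, 0.98, 0.99])
for level, z in table.items():
    print(f"{level:.0%}  {z:.3f}")
```

The computed values agree with the table to about two decimal places. The 99% value computes to roughly 2.576, while the table's 2.575 is the figure commonly read off a printed normal table, so small differences in the last digit are to be expected.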

Estimating the population mean; small sample sizes, \(\sigma\) unknown

In many real life situations, sample sizes of less than 30 are the norm.

The t Distribution

William S. Gosset (a British statistician) developed the t distribution, which describes
sample data in small samples when the population standard deviation is unknown and the
population is normally distributed. The formula for the t value is;

\[ t = \frac{\bar{x} - \mu}{S / \sqrt{n}} \]

The formula is essentially the same as the Z formula, but the distribution table values are
different.
The assumption underlying the use of the techniques discussed in this chapter for small
sample sizes is that the population is normally distributed. If the population distribution is
not normal or is unknown, nonparametric techniques should be used.

Characteristics of the t Distribution


▪ Symmetric
▪ Unimodal
▪ A family of curves
▪ Flatter in the middle, with more area in the tails than the standard normal
distribution
An examination of t distribution values reveals that the t distribution approaches the
standard normal curve as n becomes large.

The t distribution is the appropriate distribution to use any time the population variance or
standard deviation is unknown, regardless of sample size. However, because the difference
between the table values for Z and t becomes negligible for large samples, many researchers
use the Z distribution for large-sample analysis even when the standard deviation or
variance is unknown.
In this chapter, the t distribution is reserved for use with small sample size problems
(n < 30) because, as n nears size 30, the t table values approach the Z table values.

Finding a value in the t distribution table requires knowing the sample size. The t
distribution table is a compilation of many t distributions, with each line of the table
representing a different sample size. However, the sample size must be converted to
degrees of freedom (df) before a table value can be determined.

The t formulas are used because the population variance or standard deviation, which is
part of the Z formula, is unknown and must be estimated by a sample standard deviation or
variance.

The t distribution table does not use the area between the statistic and the mean, as the Z
table does. Instead, the t table uses the area in the tail of the distribution. The emphasis
in the t table is on \(\alpha\), and each tail of the distribution contains \(\alpha/2\) of the
area under the curve when confidence intervals are constructed.

Degrees of freedom (df = n – 1) - The number of observations that can be freely chosen

Variance of the t distribution: \(\frac{df}{df - 2}\), for df > 2

Example,
Given n = 4 observations that must produce a mean of 10, the mean of 10 serves as a
constraint and there are n – 1 = 3 degrees of freedom.
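
A small sketch makes the constraint concrete. The three freely chosen values below are hypothetical; once they are fixed, the required mean of 10 leaves only one possible fourth value.

```python
def last_value(free_values, required_mean):
    """Return the one remaining observation forced by the required mean."""
    n = len(free_values) + 1
    return required_mean * n - sum(free_values)


free = [8, 12, 15]              # hypothetical freely chosen observations
fourth = last_value(free, 10)   # 4 * 10 - (8 + 12 + 15)
print(fourth, (sum(free) + fourth) / 4)
```

Whatever three values are chosen, the fourth is determined, which is why only n – 1 = 3 observations are free.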

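Confidence interval to estimate \(\mu\) when \(\sigma\) is unknown and sample size is small

The t formula

\[ t = \frac{\bar{x} - \mu}{S / \sqrt{n}} \]

can be manipulated algebraically to produce a formula for estimating the population mean
using small samples when \(\sigma\) is unknown and the population is normally distributed.
The result is the formula given next

\[ \bar{x} \pm t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \]
or

\[ \bar{x} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \] ……………………………….( 2.4)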
Example;
The owner of a large equipment rental company wants to make a rather quick estimate of the
average number of days a piece of equipment is rented out per person per time. She
decides to take a random sample of rental invoices. Fourteen different rentals of the
equipment are selected randomly from the files, yielding the following data. She uses these
data to construct a 99% confidence interval to estimate the average number of days that
equipment is rented, and assumes that the number of days per rental is normally distributed
in the population.

3 1 3 2 5 1 2 1 4 2 1 3 1 1

As n = 14, df = 13. The 99% level of confidence results in \(\alpha/2\) = 0.005 of the area in
each tail of the distribution. The table t value is
\(t_{0.005,\,13}\) = 3.012
The sample mean is 2.14 and the sample standard deviation is 1.29. The confidence interval
is

\[ 2.14 \pm 3.012 \frac{1.29}{\sqrt{14}} = 2.14 \pm 1.04 \]
\[ 1.10 \le \mu \le 3.18 \]
\[ \text{Prob}(1.10 \le \mu \le 3.18) = 0.99 \]
The point estimate of the average length of time per rental is 2.14 days, with an error
of ±1.04 days.
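
The rental example can be reproduced from the raw data. This sketch uses `mean` and `stdev` from Python's standard `statistics` module (`stdev` divides by n – 1, matching the sample standard deviation S). The t value of 3.012 is entered by hand from the table, since the standard library has no t distribution; the function name `t_interval_parts` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

days = [3, 1, 3, 2, 5, 1, 2, 1, 4, 2, 1, 3, 1, 1]
T_TABLE_VALUE = 3.012  # t for alpha/2 = 0.005 and df = 13, taken from the text


def t_interval_parts(data, t_value):
    """Return the sample mean, sample standard deviation and margin of error."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)                 # sample standard deviation (n - 1 divisor)
    margin = t_value * s / sqrt(n)
    return x_bar, s, margin


x_bar, s, margin = t_interval_parts(days, T_TABLE_VALUE)
print(f"mean = {x_bar:.2f}, S = {s:.2f}, margin = {margin:.2f}")
print(f"{x_bar - margin:.2f} <= mu <= {x_bar + margin:.2f}")
```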

Estimating the population proportion


Business decision makers and researchers often need to be able to estimate the population
proportion.
The central limit theorem for sample proportions leads to the following formula.

\[ Z = \frac{\hat{p} - P}{\sqrt{\frac{P \cdot Q}{n}}} \]

Where Q = 1 – P. Recall that this formula can be applied only when n·P and n·Q are greater
than 5.
Algebraically manipulating this formula to estimate P involves solving for P. However, P is
in both the numerator and the denominator, which complicates the resulting formula. For
this reason, for confidence interval purposes only and for large sample sizes, \(\hat{p}\) is
substituted for P in the denominator, yielding

\[ Z = \frac{\hat{p} - P}{\sqrt{\frac{\hat{p} \cdot \hat{q}}{n}}} \]

Where \(\hat{q}\) = 1 – \(\hat{p}\). Solving for P results in the confidence interval in formula 2.5

Confidence interval to estimate P

\[ \hat{p} - Z_{\alpha/2} \sqrt{\frac{\hat{p} \cdot \hat{q}}{n}} \le P \le \hat{p} + Z_{\alpha/2} \sqrt{\frac{\hat{p} \cdot \hat{q}}{n}} \] ……………………………….2.5
Where
\(\hat{p}\) = Sample proportion
\(\hat{q}\) = 1 – \(\hat{p}\)
P = Population proportion
n = Sample size
In this formula, \(\hat{p}\) is the point estimate and \(\pm Z_{\alpha/2} \sqrt{\hat{p} \cdot \hat{q} / n}\) is the error of the estimation.

Example
A study of 87 randomly selected companies with a telemarketing operation revealed that
39% of the sampled companies had used telemarketing to assist them in order processing.
Using this information, how could a researcher estimate the population proportion of
telemarketing companies that use their telemarketing operation to assist them in order
processing?

Solution
- \(\hat{p}\) = 0.39 is the point estimate of the population proportion, P
- For n = 87 and \(\hat{p}\) = 0.39, a 95% confidence interval can be computed to determine
the interval estimate of P.
- The Z value for 95% confidence is 1.96.
- The value of \(\hat{q}\) = 1 – \(\hat{p}\) is 1 – 0.39 = 0.61
- The confidence interval estimate is

\[ 0.39 - 1.96 \sqrt{\frac{(0.39)(0.61)}{87}} \le P \le 0.39 + 1.96 \sqrt{\frac{(0.39)(0.61)}{87}} \]
\[ 0.29 \le P \le 0.49 \]
\[ \text{Prob}(0.29 \le P \le 0.49) = 0.95 \]
- There is a point estimate of 0.39 with an error of ±0.10. This result has a 95% level of
confidence.
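
The telemarketing example can be checked with a short sketch. The function names are hypothetical, and the first helper simply applies the rule stated earlier that both n times the proportion and n times its complement should exceed 5, using the sample proportion in place of P.

```python
from math import sqrt


def large_enough(p_hat, n):
    """Check the rule of thumb: n * p_hat and n * q_hat both greater than 5."""
    return n * p_hat > 5 and n * (1 - p_hat) > 5


def proportion_interval(p_hat, n, z):
    """Interval p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    q_hat = 1 - p_hat
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin


# Telemarketing example: n = 87, sample proportion 0.39, Z = 1.96.
ok = large_enough(0.39, 87)
low, high = proportion_interval(0.39, 87, 1.96)
print(ok, f"{low:.2f} <= P <= {high:.2f}")
```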
Estimating the population variance
Suppose a researcher wants to estimate the population variance. The relationship of the
sample variance to the population variance is captured by the chi-square distribution (\(\chi^2\)).
- The chi-square statistic lacks robustness: it is sensitive to departures from the assumption
that the population is normally distributed.
- The number of degrees of freedom for the chi-square formula is n – 1
Formula for single variance

\[ \chi^2 = \frac{(n - 1) S^2}{\sigma^2} \] …………………………………………………………….2.6
df = n – 1
- The chi-square distribution is not symmetrical, and its shape varies according to the
degrees of freedom

Formula 2.6 can be rearranged algebraically to produce a formula that can be used to
construct confidence intervals for population variances.

Confidence interval to estimate the population variance

\[ \frac{(n - 1) S^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n - 1) S^2}{\chi^2_{1 - \alpha/2}} \]
df = n – 1
The value of \(\alpha\) is equal to 1 – (level of confidence expressed as a
proportion). Thus, if we are constructing a 90% confidence interval, alpha is 10% of the
area and is expressed in proportion form as \(\alpha\) = 0.10.

Example
Given S = 1.12 and n = 25, develop a 95% confidence interval to estimate the population
variance. Assume the population is normally distributed.

Solution
\(S^2\) = \((1.12)^2\) = 1.2544
df = n – 1 = 25 – 1 = 24
A 95% confidence level means that alpha (\(\alpha\)) is 1 – 0.95 = 0.05. This value is split to
determine the area in each tail of the chi-square distribution; \(\alpha/2\) = 0.025. The values
of chi-square obtained from the table are;

\(\chi^2_{0.025,\,24}\) = 39.3641
\(\chi^2_{0.975,\,24}\) = 12.4011
From this information, the confidence interval can be determined.

\[ \frac{(24)(1.2544)}{39.3641} \le \sigma^2 \le \frac{(24)(1.2544)}{12.4011} \]
\[ 0.7648 \le \sigma^2 \le 2.4277 \]
\[ \text{Prob}(0.7648 \le \sigma^2 \le 2.4277) = 0.95 \]
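
The variance interval can be verified with a short calculation. In this sketch the two chi-square values for df = 24 are entered by hand from the table (39.3641 and 12.4011), because Python's standard library has no chi-square quantile function; the function name `variance_interval` is hypothetical.

```python
def variance_interval(s, n, chi2_upper_tail, chi2_lower_tail):
    """Interval for the population variance: (n - 1) * S^2 over each chi-square value."""
    numerator = (n - 1) * s ** 2
    # Dividing by the larger chi-square value gives the lower limit.
    return numerator / chi2_upper_tail, numerator / chi2_lower_tail


# Example figures: S = 1.12, n = 25, table values for df = 24.
low, high = variance_interval(1.12, 25, 39.3641, 12.4011)
print(f"{low:.4f} <= variance <= {high:.4f}")
```

The interval is not centred on the sample variance of 1.2544, which reflects the asymmetry of the chi-square distribution.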

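Graphically

[Figure: chi-square distribution for df = 24 with an area of 0.025 in each tail. The lower critical value \(\chi^2_{0.975}\) has an area of 0.975 to its right; the upper critical value is \(\chi^2_{0.025}\).]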

Estimating sample size


In most business research that uses sample statistics to infer about the population, being
able to estimate the size of sample necessary to accomplish the purpose of the study is
important.

Sample size when estimating \(\mu\)

- When \(\mu\) is being estimated, the size of sample can be determined by using the Z formula
for sample means to solve for n.

\[ Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

- \((\bar{x} - \mu)\) is the error of estimation resulting from the sampling process. Let
E = \((\bar{x} - \mu)\), the error of estimation.

\[ Z = \frac{E}{\sigma / \sqrt{n}} \]
Solving for n produces the sample size

Sample size when estimating \(\mu\)

\[ n = \frac{Z_{\alpha/2}^2 \sigma^2}{E^2} = \left( \frac{Z_{\alpha/2} \sigma}{E} \right)^2 \]

- If \(\sigma\) is unknown, use the following estimate to represent \(\sigma\).

\[ \sigma \approx \frac{1}{4} (\text{range}) \]
This estimate is derived from the empirical rule stating that approximately 95% of the
values in a normal distribution are within \(\pm 2\sigma\) of the mean, giving a range within
which most of the values are located.
Example
Suppose you want to estimate the average age of all Boeing 727 airplanes now in
active domestic U.S. service. You want to be 95% confident, and you want your estimate to
be within 2 years of the actual figure. The 727 was first placed in service about 30 years
ago, but you believe that no active 727 in the U.S. domestic fleet is more than 25 years
old. How large a sample should you take?

Solution
E = 2 years, the Z value for 95% is 1.96, and \(\sigma\) is unknown
\(\sigma \approx \frac{1}{4}\) (range)
\(\sigma \approx \frac{1}{4}\) (25) = 6.25

\[ n = \frac{Z^2 \sigma^2}{E^2} = \frac{(1.96)^2 (6.25)^2}{2^2} = 37.52 \]

Rounding up, n = 38.
If you randomly sample 38 units, you have an opportunity to estimate the average age of
active 727s within two years and be 95% confident of the results.
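
The sample size calculation for this example can be written as a short sketch. The function name `sample_size_for_mean` is hypothetical; the standard deviation is approximated as one quarter of the range, as described above, and the result is rounded up because a sample size must be a whole number.

```python
from math import ceil


def sample_size_for_mean(z, sigma, error):
    """n = (z * sigma / E)^2, before rounding up."""
    return (z * sigma / error) ** 2


sigma_estimate = 25 / 4   # one quarter of the assumed 25-year range
raw_n = sample_size_for_mean(1.96, sigma_estimate, 2)
n = ceil(raw_n)           # always round up to the next whole unit
print(f"raw n = {raw_n:.2f}, sample size = {n}")
```

Rounding up rather than to the nearest integer keeps the error of estimation at or below the stated 2 years.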
