
Sampling Dist

Sampling Distribution for Business Statistics

Uploaded by

Sanket Sharma
Copyright
© All Rights Reserved

Sampling Distribution

Statistics vs. Parameters
• Statistic – a numerical value computed from a sample.
• Parameter – a numerical value associated with a population.
• Essentially, we would like to know the parameter. But in most cases it is hard to know the parameter since the population is too large. So we have to estimate the parameter by a suitable statistic computed from the sample.
Introduction
• A sampling distribution of a sample statistic
(say mean) is the probability distribution of
that sample statistic of all the possible
samples of a given size.
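The definition can be illustrated by listing every possible sample from a very small population. The sketch below is only an illustration: the four population values and the sample size of 2 are hypothetical, and samples are drawn without replacement.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [2, 4, 6, 8]  # hypothetical population, N = 4
n = 2                      # sample size (assumed for illustration)

# Every possible sample of size n, drawn without replacement
samples = list(combinations(population, n))

# The sample mean of each possible sample
means = [Fraction(sum(s), n) for s in samples]

# Sampling distribution of the mean: each distinct mean with its probability
counts = Counter(means)
sampling_distribution = {
    m: Fraction(c, len(samples)) for m, c in sorted(counts.items())
}

mean_of_sample_means = sum(m * p for m, p in sampling_distribution.items())
population_mean = Fraction(sum(population), len(population))

for m, p in sampling_distribution.items():
    print(f"mean = {m}, probability = {p}")
```

Here the mean of the sampling distribution works out to the same value as the population mean, which is the first property stated under the Central Limit Theorem.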
Example Contd..
Central Limit Theorem (CLT)
• The mean of the sampling distribution of the mean is always equal to the mean of the population.
• As the sample size increases, the sampling distribution of the mean approaches normality, regardless of the shape of the population distribution.
• This relationship between the shape of the population distribution and the shape of the sampling distribution of the mean is called the Central Limit Theorem.
Contd..
• It is perhaps the most important theorem in
inferential statistics.
• The significance of CLT is that without the knowledge
of the frequency distribution of the population, it
permits us to use the sample statistics to make
inferences about the population parameters.
• We will use the symbols (µ, σ) for the mean and S.D. of the population, and (µ_x̄, σ_x̄) for the mean and standard error of the sampling distribution of the mean, respectively.
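A small simulation can make the theorem concrete. The sketch below is not from the slides: it assumes a strongly skewed Exponential(1) population (µ = 1, σ = 1), a fixed random seed and 5000 replicates, all chosen only for illustration. It compares the sample means for n = 2 and n = 50.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

def sample_means(n, replicates=5000):
    """Means of `replicates` random samples of size n from Exponential(1)."""
    return [
        statistics.fmean(random.expovariate(1.0) for _ in range(n))
        for _ in range(replicates)
    ]

def skewness(values):
    """Moment-based skewness; close to 0 for a symmetric, bell-shaped set."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

means_small = sample_means(2)
means_large = sample_means(50)

for label, means in (("n = 2", means_small), ("n = 50", means_large)):
    print(
        f"{label}: mean = {statistics.fmean(means):.3f}, "
        f"S.E. = {statistics.pstdev(means):.3f}, "
        f"skewness = {skewness(means):.2f}"
    )
```

With the larger sample size the means stay centred on µ, their spread shrinks towards σ / √n, and the skewness moves towards 0, which is the behaviour the theorem describes.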
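Definition of standard error of mean from the population S.D.

The standard error of a statistic is the standard deviation of the sampling distribution of that statistic. For the sample mean, σ_x̄ = σ / √n for an infinite population, and σ_x̄ = (σ / √n) × √((N − n) / (N − 1)) for a finite population. Here n and N are the sample and the population size, respectively.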
An example
• Let's take a population of annual earnings of class II employees (mean = $19,000 and S.D. = $2,000) in all nationalized banks. If we draw a random sample of 30 employees, what is the probability that their average annual earnings will be more than $19,750?
• The figures on the next slide depict the population distribution and the sampling distribution of the mean, respectively.
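The probability asked for in this example can be worked out as sketched below. The sketch assumes that the normal distribution applies to the sample mean (n = 30) and that the population is large enough to ignore the finite population multiplier; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 19_000         # population mean of annual earnings ($)
sigma = 2_000       # population S.D. ($)
n = 30              # sample size
threshold = 19_750  # sample mean of interest ($)

# Standard error of the mean for an (effectively) infinite population
standard_error = sigma / sqrt(n)

# Standardise the threshold, then take the upper-tail area under Z
z = (threshold - mu) / standard_error
probability = 1 - NormalDist().cdf(z)

print(f"S.E. = {standard_error:.2f}, z = {z:.2f}, P = {probability:.4f}")
```

The standard error is about $365, z is about 2.05, and the probability is roughly 0.02, so a sample average above $19,750 is unlikely.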
Sampling Distribution of Proportion
• Used when issue of interest is categorical in
nature
• Classified as occurrences or non-occurrences
• We estimate the proportion of occurrences in
such cases
• Since complete information about population is
not available, we use sample proportion to
estimate true proportion
• Let x be the number of occurrences in a sample of size n.
• x follows a binomial distribution with probability of occurrence p.
• According to the binomial distribution, the mean of x is np and its S.D. is √(np(1-p)).
• The sampling distribution of the sample proportion p̄ = x/n therefore has mean p and standard error √(p(1-p) / n).
• If n is large, we can approximate this binomial distribution by a normal distribution with the same mean and standard error.
An example on sampling distribution of
proportion

• A certain company's customer base is made up of 43% women and 57% men. In a sample of 50 customers, how likely is it that 46% or more of the customers are women?
Solution
• Step 1: Check that your sample size is large enough:
o n * p = 50 * 0.43 = 21.5
o n * (1-p) = 50 * (1-0.43) = 28.5
Both are above 5, so we can use the normal approximation to the binomial distribution.
• Step 2: Find the standard error (S.E.):
√ (p(1-p) / n) = √ (0.43(1-0.43) / 50) = 0.07.

• Step 3: Find the z-score, using the S.E. you calculated in Step 2:
z = (p̄ – p) / S.E.
= (0.46 – 0.43) / 0.07 = 0.43.

• Step 4: P(Z ≥ 0.43) = 1 – 0.6664 = 0.3336, or 33.36%.
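The four steps can be checked with a short script. This is a sketch under the same assumptions as the solution (normal approximation, no continuity correction); the variable names are illustrative. It reports the tail area both for the unrounded z and for z rounded to two decimals, as read from a printed table.

```python
from math import sqrt
from statistics import NormalDist

p = 0.43      # population proportion of women
n = 50        # sample size
p_bar = 0.46  # sample proportion of interest

# Step 1: both expected counts should exceed 5
large_enough = n * p > 5 and n * (1 - p) > 5

# Step 2: standard error of the sample proportion
se = sqrt(p * (1 - p) / n)

# Step 3: z-score
z = (p_bar - p) / se

# Step 4: upper-tail probability
p_exact = 1 - NormalDist().cdf(z)              # unrounded z
p_table = 1 - NormalDist().cdf(round(z, 2))    # z rounded as in a Z table

print(f"S.E. = {se:.4f}, z = {z:.4f}")
print(f"P (unrounded z) = {p_exact:.4f}, P (table z) = {p_table:.4f}")
```

The small gap between the two probabilities comes only from rounding z to 0.43 before looking it up.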


The relationship between sample size and
standard error
• Standard error is a measure of the dispersion of
the sample means around the population mean.
• As it decreases, the values taken by the sample means tend to cluster more closely around the population mean, and vice versa.
• In other words, if S.E. decreases, the value of any
sample mean will probably be closer to the value
of the population mean.
• Putting it differently, if S.E. decreases, the
precision with which the sample mean can be used
to estimate the population mean increases.
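The effect of sample size on the standard error can be tabulated directly. The sketch below reuses σ = $2,000 from the earnings example; the three sample sizes are hypothetical and chosen so that each is four times the previous one.

```python
from math import sqrt

sigma = 2_000  # population S.D. from the earnings example ($)

def standard_error(sigma, n):
    """Standard error of the mean for an infinite population."""
    return sigma / sqrt(n)

# Hypothetical sample sizes, each four times the previous one
table = {n: standard_error(sigma, n) for n in (30, 120, 480)}

for n, se in table.items():
    print(f"n = {n:>3}: S.E. = {se:.1f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error, so extra precision becomes progressively more expensive.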
The Finite Population Multiplier
Sampling Fraction
• When the population size N is very large relative to the sample size n, the finite population multiplier √((N − n) / (N − 1)) takes a value close to 1.
• The fraction n/N is called the sampling fraction.
• When the sampling fraction is small, the standard error of the mean for the finite population is very close to the standard error of the mean for the infinite population.
• Hence the same formula can be used for both of them, i.e. σ_x̄ = σ / √n.
• As a rule of thumb, when the sampling fraction is less than 0.05, the finite population multiplier need not be used.
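The rule of thumb can be seen numerically. The sketch below assumes the usual multiplier √((N − n) / (N − 1)); the population size of 1000 and the two sample sizes are hypothetical, and the function names are illustrative.

```python
from math import sqrt

def finite_population_multiplier(N, n):
    """Correction factor for sampling without replacement from N items."""
    return sqrt((N - n) / (N - 1))

def standard_error_of_mean(sigma, n, N=None):
    """sigma / sqrt(n), times the multiplier when a population size is given."""
    se = sigma / sqrt(n)
    if N is not None:
        se *= finite_population_multiplier(N, n)
    return se

N = 1000  # hypothetical population size
for n in (20, 200):  # sampling fractions 0.02 and 0.20
    fpm = finite_population_multiplier(N, n)
    print(f"n/N = {n / N:.2f}: multiplier = {fpm:.4f}")
```

With a sampling fraction of 0.02 the multiplier changes the standard error by less than 1%, while at 0.20 it reduces it by roughly 10%, so ignoring it there would overstate the standard error.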
A Problem
Answers
(a) 0.9836
(b) 34
Estimation
• There are two types of inference: estimation and
hypothesis testing; estimation is introduced first.

• The objective of estimation is to determine the approximate value of a population parameter on the basis of a sample statistic.

• E.g., the sample mean (x̄) is employed to estimate the population mean (µ).

Estimation…

• There are two types of estimators:

• Point Estimator

• Interval Estimator
Point Estimator…
• A point estimator draws inferences about a
population by estimating the value of an unknown
parameter using a single value or point.

• We expect that the point estimator gets closer to the parameter value with an increased sample size, but point estimators don't reflect the effects of larger sample sizes. Hence we will employ the interval estimator to estimate population parameters.
Interval Estimator…
• An interval estimator draws inferences about a
population by estimating the value of an unknown
parameter using an interval.

• That is, we say (with some ___% certainty) that the population parameter of interest is between some lower and upper bounds.
Point & Interval Estimation…
• For example, suppose we want to estimate the mean summer income of a class of business students. For n = 25 students, x̄ is calculated to be 400 $/week. This single value is the point estimate.

• An alternative statement is: the mean income is between 380 and 420 $/week. This range is the interval estimate.

Qualities of Estimators

Qualities desirable in estimators include unbiasedness, consistency and relative efficiency.
• An unbiased estimator of a population parameter is an estimator whose expected value is equal to that parameter.
• An unbiased estimator is said to be consistent if the difference between the estimator and the parameter grows smaller as the sample size grows larger.
• If there are two unbiased estimators of a parameter, the one whose variance is smaller is said to be relatively efficient.
Interval Estimates and Confidence Intervals

• In statistics, the probability that we associate with an interval estimate is called the confidence level.
• This probability indicates how confident we are that the interval estimate will include the population parameter.
• Confidence levels of 90%, 95% and 99% are the ones most often used.
• The confidence interval is the range of the estimate that we are making.
• Whenever the sample size is 30 or more, the normal distribution is the appropriate sampling distribution to use for determining confidence intervals.
Calculating interval estimates of the mean
from large samples when population S.D. is
known

• Calculate the interval estimate of the mean life of the population of wiper blades when a sample of size 100 is drawn with a mean life of 21 months, where the S.D. of the population is 6 months.

• Take the confidence level as 95%.
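The interval for this example can be computed as sketched below. It assumes the usual large-sample form x̄ ± z × σ / √n with the population S.D. known, and takes z from the standard normal distribution instead of a printed table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_bar = 21         # sample mean life (months)
sigma = 6          # population S.D. (months)
n = 100            # sample size
confidence = 0.95  # confidence level

# Standard error of the mean
standard_error = sigma / sqrt(n)

# z value leaving (1 - confidence) / 2 in each tail, about 1.96 for 95%
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z * standard_error
lower, upper = x_bar - margin, x_bar + margin

print(f"S.E. = {standard_error:.2f}, z = {z:.2f}")
print(f"95% interval: {lower:.2f} to {upper:.2f} months")
```

The standard error is 0.6 months, so the interval runs from about 19.82 to 22.18 months.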


Areas of the Standard Normal Distribution
When the population S.D. is unknown
Calculating interval estimates of the
proportion from large samples
Interval estimates using the t distribution
• Whenever the sample size is 30 or more, the normal distribution is the appropriate sampling distribution to use for determining confidence intervals.
• But how do we handle cases where we have to estimate the population S.D. and the sample size is less than 30?
• Another distribution, called the t distribution (or Student's t distribution), is used in such cases.
• Keep in mind that the t distribution is used when the sample size is less than 30 and the population S.D. is not known.
• Furthermore, when using the t distribution, we assume that the population is normal or approximately normal.
Characteristics of the t distribution

• A t distribution is like a Z (standard normal) distribution, except that it has slightly fatter tails.
• The bigger the sample size (i.e., the bigger the sample used to estimate σ), the closer t becomes to Z.
• If n > 100, t is very close to Z.
Student’s t Distribution

Note: t → Z as n increases

[Figure: the standard normal curve (t with df = ∞) plotted with t (df = 13) and t (df = 5); horizontal axis t, centered at 0.]

t-distributions are bell-shaped and symmetric, but have 'fatter' tails than the normal.
From “Statistics for Managers Using Microsoft® Excel”, 4th Edition, Prentice-Hall, 2004
Degrees of Freedom
• A separate t distribution exists for each sample size.
• In statistical language, there is a different t distribution for each of the possible degrees of freedom.
• Degrees of freedom can be defined as the number of values we can choose freely.
• For a sample of size n, we use (n-1) degrees of freedom when estimating the population mean with the t distribution.
Estimating standard error of the mean of an
infinite population

• A plant manager wants to estimate the coal needed for this year, so he took a sample by measuring coal usage for the last 10 weeks.
Here, n = 10 weeks
df = 9,
x̄ = 11,400 tons
s = 700 tons
Build a 95% confidence interval for the mean weekly usage of coal based upon this sample.
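One way to build this interval is sketched below. It assumes the usual small-sample form x̄ ± t × s / √n. The critical value 2.262 is the standard t-table entry for 9 degrees of freedom with 0.025 in each tail; it is hard-coded here as an assumption because Python's standard library has no t distribution. The variable names are illustrative.

```python
from math import sqrt

n = 10           # sample size (weeks)
x_bar = 11_400   # sample mean weekly usage (tons)
s = 700          # sample S.D. (tons)
df = n - 1       # degrees of freedom

# Assumed t-table value for df = 9 and a 95% confidence level
t_critical = 2.262

# Estimated standard error of the mean (infinite population)
standard_error = s / sqrt(n)

margin = t_critical * standard_error
lower, upper = x_bar - margin, x_bar + margin

print(f"df = {df}, estimated S.E. = {standard_error:.1f} tons")
print(f"95% interval: {lower:.1f} to {upper:.1f} tons")
```

The estimated standard error is about 221.4 tons, giving an interval of roughly 10,899 to 11,901 tons per week. Using z = 1.96 instead of the t value would give a narrower interval that understates the uncertainty from estimating σ with only 10 observations.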
t Table
Summary of Formulas for Confidence Limits
Estimating Mean and Proportion
