Topic 1 Sampling
POPULATION
■ Example:
Listing income of all the residents of Delhi-NCR
SAMPLE
■ Subset of Population
■ Example:
■ SAMPLE 1:
Listing income of only males from Delhi-NCR who are in the age group of
20-50
■ SAMPLE 2:
Listing income of all the females from Delhi-NCR
NOTE: Statistical measures computed from a sample (such as the mean or
standard deviation) are known as STATISTICS/ESTIMATORS, and the value one
takes for a particular sample is known as an ESTIMATE.
SAMPLE
Example:
1. If the mean of a sample is 200, then
the STATISTIC/ESTIMATOR is the Mean
and the Estimate is 200
2. If the standard deviation of a sample is 0.6, then
the STATISTIC/ESTIMATOR is the Standard Deviation
and the Estimate is 0.6
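The distinction between an estimator (the rule) and an estimate (the number it produces) can be sketched in a few lines of Python. The three income values below are hypothetical, chosen only so that the mean comes out to 200 as in the example above; they are not from the source.

```python
import statistics

# Hypothetical sample of three incomes (illustrative values only).
sample = [180, 200, 220]

# The estimator is the rule applied to the sample;
# the estimate is the number that rule returns for this sample.
mean_estimate = statistics.mean(sample)   # estimator: mean
sd_estimate = statistics.stdev(sample)    # estimator: sample standard deviation (n - 1 divisor)

print("Estimator: mean, estimate:", mean_estimate)
print("Estimator: standard deviation, estimate:", sd_estimate)
```

A different sample from the same population would generally give different estimates, while the estimators stay the same.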
RANDOM SAMPLING
■ Every element has an equal chance of being selected
■ STATISTICALLY INDEPENDENT:
If the population is very large or infinite and the elements of the sample
are selected by chance, then the outcome of any one drawing has
practically no effect on the outcome of any other drawing, i.e. the N
drawings in the sample can be regarded as statistically independent.
Example: If the population size is (10)^10 and we draw one observation
(N=1), the probability of selecting any given element is 1/(10)^10
If we then draw a second observation without replacing the first, the
probability of selecting any given remaining element is 1/((10)^10 - 1)
Now, 1/(10)^10 is approximately equal to 1/((10)^10 - 1)
Thus the drawings are, for practical purposes, statistically independent
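To see how close the two probabilities are, the following sketch compares them exactly using Python's fractions module. It assumes the second draw is made without replacing the first, so one element has been removed; the variable names are illustrative.

```python
from fractions import Fraction

population_size = 10**10

# Probability of selecting a given element on the first draw.
p_first = Fraction(1, population_size)

# Probability on the second draw, assuming the first element is not replaced.
p_second = Fraction(1, population_size - 1)

# Relative change in the probability between the two draws.
relative_change = (p_second - p_first) / p_first

print("Relative change:", float(relative_change))
```

The relative change works out to exactly 1/((10)^10 - 1), which is negligible, so treating the draws as independent is a very good approximation for a population this large.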
TWO USES OF RANDOM SAMPLING
■ HYPOTHESIS TESTING: here we test whether the sample is drawn from a
population of a certain kind with certain parameters.
Example: If the savings rate in a sample is 12.5%, we test whether this
value can be taken to hold for the population, i.e. whether the sample
represents the population or not.
■ TO MAKE INFERENCES ABOUT THE LIKELY VALUES OF THE PARAMETER
(A “PARAMETER” IS A CHARACTERISTIC OF THE POPULATION AND IS ALWAYS CONSTANT)
NOTE: Sample characteristic (STATISTIC/ESTIMATOR) may or may not be
equal to population characteristic (PARAMETER)
Example: Sample Mean may or may not be equal to population mean
SAMPLING ERRORS
The above four samples are of equal size N and they are the only possible samples
from some population. The above table shows the Sampling Distribution of the Mean.
Here the probability of mean=60 is 2/4. Because the table assigns a probability to
each possible value of the mean, it is also known as the probability distribution
of the mean.
Example 2: Population is X1=8, X2=10, X3=12. Now if we take out samples of size
2 (without replacement), then we have:
SAMPLES              MEAN
SAMPLE 1 (8,10)      9
SAMPLE 2 (10,12)     11
SAMPLE 3 (8,12)      10
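The sampling distribution in Example 2 can be reproduced by enumerating every sample of size 2. This is a minimal sketch that assumes sampling without replacement and ignores the order of selection, matching the three samples listed; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [8, 10, 12]

# All samples of size 2, drawn without replacement (order ignored).
samples = list(combinations(population, 2))

# Mean of each sample, kept as exact fractions.
means = [Fraction(sum(s), len(s)) for s in samples]

# Sampling distribution of the mean: each possible mean and its probability.
distribution = {m: Fraction(c, len(means)) for m, c in Counter(means).items()}

mean_of_means = sum(means) / len(means)
population_mean = Fraction(sum(population), len(population))

for m, p in sorted(distribution.items()):
    print("mean =", m, "probability =", p)
print("mean of sample means:", mean_of_means)
print("population mean:", population_mean)
```

Each of the three means occurs with probability 1/3, and the mean of the sample means equals the population mean of 10.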
PROPERTIES OF SAMPLING DISTRIBUTION
1. The distribution of Xbar will be normal if the population is normal, and
approximately normal for other populations when N is sufficiently large
2. The mean of the sampling distribution will be the same as the mean of the population
from which the samples were drawn, which implies Xbar is an unbiased estimator of µ
3. The Standard Error of the sampling distribution is less than the standard deviation of the
population (for N greater than 1)
STANDARD ERROR
The standard deviation of the sampling distribution is known as “The Standard Error” or
“The Standard Error of the Mean”
The variation of sample means around the population mean is the sampling error, and it is
measured using the standard error of the mean.
This is important because, when the sampling distribution is normal, sampling theory tells us:
About 68% of all sample means will lie within + or – one standard error of the population
mean
95% of all sample means will lie within + or – 1.96 standard errors of the population
mean
FORMULA:
Standard error (σ_Xbar) = σ / √N
where σ is the standard deviation of the population and N is the sample size
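As a worked illustration, the sketch below assumes the usual formula for the standard error of the mean, the population standard deviation σ divided by the square root of the sample size N. The function name and the input values (σ = 10, N = 25) are hypothetical.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean, assuming the formula sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sigma / math.sqrt(n)

# Hypothetical values: population standard deviation 10, sample size 25.
se = standard_error(10, 25)
print("Standard error:", se)
```

Because the sample size appears under a square root, quadrupling N only halves the standard error.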
CENTRAL LIMIT THEOREM
• If we select a large number of simple random samples of size N from any population
distribution (with finite variance) and find the mean of each sample, the distribution of
these sample means will tend to be described by the normal probability distribution with
mean µ and variance σ²/N, provided that N is sufficiently large
• The sampling distribution of sample means approaches a normal distribution as N increases.
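The theorem can be checked with a small simulation. This sketch assumes an exponential population with mean 1 and variance 1 (a deliberately skewed, non-normal choice that is not from the source), a sample size of 50, and 5000 repeated samples; all of these settings are illustrative.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from an exponential population (mean 1, variance 1)."""
    return statistics.fmean([random.expovariate(1.0) for _ in range(n)])

n = 50               # size of each sample
num_samples = 5000   # number of repeated samples

means = [sample_mean(n) for _ in range(num_samples)]

mean_of_means = statistics.fmean(means)    # should be close to mu = 1
var_of_means = statistics.pvariance(means) # should be close to sigma^2 / n = 0.02

# Share of sample means within 1.96 standard errors of the population mean.
se = 1.0 / math.sqrt(n)
share_within = sum(abs(m - 1.0) <= 1.96 * se for m in means) / num_samples

print("mean of sample means:", mean_of_means)
print("variance of sample means:", var_of_means)
print("share within 1.96 standard errors:", share_within)
```

Even though the population is skewed, the sample means should centre on 1 with variance near 1/50, and roughly 95% of them should fall within 1.96 standard errors of the population mean.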
EXERCISE
A population consists of five numbers 2,3,6,8 and 11. Consider all possible samples of
size 2 that can be drawn with and without replacement from this population.
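One possible way to check answers to this exercise is to enumerate the samples directly. The sketch below assumes that samples drawn with replacement are ordered pairs (25 in all) and that samples drawn without replacement are unordered pairs (10 in all); the helper name summarize is illustrative.

```python
from fractions import Fraction
from itertools import combinations, product

population = [2, 3, 6, 8, 11]

def summarize(samples):
    """Return (number of samples, mean of sample means, variance of sample means)."""
    means = [Fraction(sum(s), len(s)) for s in samples]
    mean_of_means = sum(means) / len(means)
    variance = sum((m - mean_of_means) ** 2 for m in means) / len(means)
    return len(means), mean_of_means, variance

# With replacement: every ordered pair, including repeated elements.
with_replacement = summarize(list(product(population, repeat=2)))

# Without replacement: every unordered pair of distinct elements.
without_replacement = summarize(list(combinations(population, 2)))

print("with replacement:", with_replacement)
print("without replacement:", without_replacement)
```

In both cases the mean of the sample means equals the population mean of 6, while the variance of the sample means is smaller when sampling without replacement.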
INTERVAL ESTIMATION:
An interval estimate is defined by two numbers, between which a population
parameter is said to lie. E.g.:
a < µ < b is an interval estimate of the population mean µ
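As an illustration, the two numbers a and b can be built from a sample mean and the standard error. This sketch assumes a known population standard deviation, a normal sampling distribution, and the 1.96 multiplier for 95% coverage; the function name and the input values (sample mean 200, σ = 10, N = 25) are hypothetical.

```python
import math

def interval_estimate(xbar, sigma, n, z=1.96):
    """Return (a, b) = xbar -/+ z standard errors, assuming sigma is known."""
    se = sigma / math.sqrt(n)   # standard error of the mean
    return xbar - z * se, xbar + z * se

# Hypothetical values: sample mean 200, population standard deviation 10, N = 25.
a, b = interval_estimate(200, 10, 25)
print("Interval estimate of the population mean:", a, "to", b)
```

With these inputs the standard error is 2, so the interval extends 3.92 on either side of the sample mean.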