
Sampling Methods 6 and 7 April

Uploaded by

Himanshu

CENSUS AND SAMPLE

Let us try to understand the terms 'census' and 'sample' with the help of an
illustration. Suppose you wish to study the impact of T.V. advertisements on
children in Delhi. You then have to collect relevant information from the children
residing in Delhi who view T.V. In statistical terminology, these children are the
population for your study. If you collect the data from all of them,
not leaving out a single child, it is known as the census method of data collection. This
means studying the whole population. If instead you select only some children
from among them for gathering the desired information,
because it is not feasible to gather the information from all the children, the
selected group is known as a sample. Therefore, a sample is a subset of a statistical
population whose characteristics are studied to learn about the
whole population. When dealing with people, it can be defined as a set of
respondents (people) selected from a population for the purpose of a
survey. A population is a group of individual persons, objects, items or any other
units from which samples are taken for measurement.
A complete survey of population is called a census. It involves covering all
respondents, items, or units of the population. For example, if we want to know the
wage structure of the textile industry in the country, then one approach is to collect
the data on the wages of each and every worker in the
textile industry. On the other hand, a sample is a representative subset of
population. Thus in a sample survey we cover only a sample of respondents, items
or units of population we are interested in and then draw inferences about the
whole population.
The following are the advantages of census:
1) In a census each and every respondent of the population is considered and
various population parameters are compiled for information.
2) The information obtained on the basis of census data is more reliable and
accurate. It is the accepted method of collecting data on matters of special
importance such as child labour, distribution by sex, educational level of the people etc.
3) If we are conducting a survey for the first time we can have a census instead of
sample survey. The information based on this census method becomes a base for
future studies. Similarly, some of the studies of special importance like population
data are obtained only through census.
WHY SAMPLING?
One of the decisions to be made by a researcher in conducting a survey is whether
to go for a census or a sample survey. We obtain a sample rather than a complete
enumeration (a census) of the population for many reasons. The most important
considerations for this are: cost, size of the population, accuracy of data,
accessibility of population, timeliness, and destructive observations.
1) Cost: The cost of conducting surveys through the census method would be
prohibitive, and sampling helps in substantial cost reduction of surveys. Since
the financial resources available to conduct a survey are usually scarce, it is
often necessary to go for a sample survey rather than a census.
2) Size of the Population: If the size of the population is very large, it is difficult,
if not impossible, to conduct a census. In such situations a sample survey is the only
way to analyse the characteristics of a population.
3) Accuracy of Data: Although reliable information can be obtained through a
census, sometimes the accuracy of information may be lost because of a large
population. Sampling involves a small part of the population and a few trained
people can be involved to collect accurate data. On the other hand,
a lot of people are required to enumerate all the observations. Often it becomes
difficult to involve trained manpower in large numbers to collect the data thereby
compromising accuracy of data collected. In such a situation a sample may be
more accurate than a census. A sloppily conducted census can provide less reliable
information than a carefully obtained sample.
4) Accessibility of Population: There are some populations that are so difficult to
get access to that only a sample can be used, e.g., people in prison, birds migrating
from one place to another place etc. The inaccessibility may be economic or time
related. In a particular study, population may be so costly to reach, like the
population of planets, that only a sample can be used.
5) Timeliness: Since we are covering a small portion of a large population through
sampling, it is possible to collect the data in far less time than covering the entire
population. Not only does it take less time to collect the data through sampling but
the data processing and analysis also take less
time because fewer observations need to be covered. Suppose a company wants to
get quick feedback from its consumers on their perceptions of a new
improved detergent in comparison to an existing version of the detergent. Here the
time factor is very significant. In such
situations it is better to go for a sample survey rather than a census because it
saves a lot of time and the product launch decision can be taken quickly.
6) Destructive Observations: Sometimes the very act of observing the desired
characteristics of a unit of the population destroys it for the intended use. Good
examples of this occur in quality control. For example, to test the quality of a bulb,
to determine whether it is defective, it must be destroyed.
To obtain a census of the quality of a lorry load of bulbs, you have to destroy all of
them. This is contrary to the purpose served by quality-control testing. In this case,
only a sample should be used to assess the quality of the bulbs. Another example is
the blood test of a patient.
The disadvantages of sampling are few but the researcher
must be cautious. These are risk, lack of representativeness and insufficient sample
size, each of which can cause errors. If the researcher does not pay attention to
these flaws, the results may be invalidated.
1) Risk: Using a sample from a population and drawing inferences about the entire
population involves risk. In other words the risk results from dealing with a part of
a population. If the risk is not acceptable in seeking a solution to a problem then a
census must be conducted.
2) Lack of representativeness: Determining the representativeness of the sample
is the researcher's greatest problem. By definition, 'sample' means a representative
part of an entire population. It is necessary to obtain a sample that meets the
requirement of representativeness, otherwise the sample will be biased. The
inferences drawn from non-representative samples will be misleading and potentially
dangerous.
3) Insufficient sample size: The other significant problem in sampling is to
determine the size of the sample. The size required for a valid sample depends
on several factors, such as the extent of risk that the researcher is willing to accept and
the characteristics of the population itself.
4.4 ESSENTIALS OF A GOOD SAMPLE
It is important that the sampling results must reflect the characteristics of the
population. Therefore, while selecting the sample from the population under
investigation it should be ensured that the sample has the following characteristics:
1) A sample must represent a true picture of the population from which it is drawn.
2) A sample must be unbiased by the sampling procedure.
3) A sample must be taken at random so that every member of the population
has an equal chance of selection.
4) A sample must be sufficiently large but as economical as possible.
5) A sample must be accurate and complete. It should not leave any information
incomplete and should include all the respondents, units or items included in the
sample.
6) Adequate sample size must be taken considering the degree of precision
required in the results of inquiry.

4.5 METHODS OF SAMPLING


If money, time, trained manpower and other resources were not a concern, the
researcher could get most accurate data from surveying the entire population of
interest. Since most often the resources are scarce, the researcher is forced to go for
sampling. But the real purpose of the survey is to know the characteristics of the
population. Then the question is with what level of confidence the researcher will
be able to say that the characteristics of a sample represent the entire population.
Using a combination of tests of hypotheses and unbiased sampling methods, the
researcher can collect data that actually represents the characteristics of the entire
population from which the sample was taken. To ensure a high level of confidence
that the sample represents the population, it is necessary that the sample is unbiased
and sufficiently large.
Statistical theory shows that as we increase the sample size, the sample results
tend to come closer to the characteristics of the population. Ultimately, if we cover each
and every unit of the population, the characteristics of the sample will be equal to
the characteristics of the population. That is why in a census there is no sampling
error. Thus, “generally speaking, the larger the sample size, the less
sampling error we have.”
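The link between sample size and sampling error can be illustrated with a small
sketch. It assumes simple random sampling without replacement and uses the
standard textbook formula for the standard error of a sample mean with the finite
population correction. The formula, the function name and the figures (a standard
deviation of 15, a population of 10,000) are illustrative assumptions, not taken
from this unit.

```python
import math


def standard_error_of_mean(sigma, n, population_size):
    """Standard error of a sample mean under simple random sampling
    without replacement (includes the finite population correction)."""
    if not 1 <= n <= population_size:
        raise ValueError("n must be between 1 and the population size")
    if population_size == 1:
        return 0.0
    # The correction factor shrinks to zero as the sample approaches a census.
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    return sigma / math.sqrt(n) * fpc


# Illustrative figures only: sigma = 15, N = 10,000.
for n in (10, 100, 1000, 10000):
    se = standard_error_of_mean(sigma=15.0, n=n, population_size=10000)
    print(n, round(se, 3))
```

The printed values fall as n grows, and the last one (n equal to the population
size, i.e. a census) is exactly zero, which matches the statement that a census
has no sampling error.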
In statistics, bias means systematic error, that is, error that pushes the results
consistently in one direction. An unbiased sample is one that is free of such
systematic error. In practice, it is impossible to achieve a completely error free
sample even using unbiased sampling methods, because chance error remains.
However, we can minimize the error by
employing appropriate sampling methods.
The various sampling methods can be classified into two categories. These are
random sampling methods and non-random sampling methods. Let us discuss them
in detail.
4.5.1 Random Sampling Methods
The random sampling method is also often called probability sampling. In random
sampling all units or items in the population have a chance of being chosen in the
sample. In other words a random sample is a sample in which each element of the
population has a known and non-zero chance of being selected. Random sampling
does not eliminate sampling error, but the size of
the sampling error in a random sample is affected only by random chance and not
by the preferences of the person selecting the sample.
In that sense we may say
that it is an unbiased sample. Remember that we are not saying that a random
sample contains no error, but
rather that the error it contains is due to chance alone. The major advantage of random
sampling is that it is possible to quantify the magnitude of the likely error in the
inference made and this will help in building confidence in drawing inferences.
The following are the important methods of random sampling:
1) Simple Random Sampling
2) Systematic Sampling
3) Stratified Random Sampling
4) Cluster Sampling
5) Multistage Sampling
1. Simple Random Sampling: The most commonly used random sampling
method is simple random sampling method. A simple random sample is one in
which each item in the total population has an equal chance of being included in
the sample. In addition, the selection of one item for inclusion in the sample should
in no way influence the selection of another item. Simple random
sampling should be used with a homogeneous population, that is, a population
consisting of items that possess the same attributes that the researcher is interested
in. The characteristics on which homogeneity is judged may include age, sex, income,
social/religious/political affiliation, geographical region etc.
The best way to choose a simple random sample is to use a random number table. A
random sampling method should meet the following criteria.
a) Every member of the population must have an equal chance of inclusion in the
sample.
b) The selection of one member is not affected by the selection of previous
members.
The random numbers are a collection of digits generated through a probabilistic
mechanism. The random numbers have the following properties:
i) The probability that each digit (0, 1, 2, 3, 4, 5, 6, 7, 8 or 9) will appear at any place is
the same, that is, 1/10.
ii) The occurrence of any two digits in any two places is independent of each other.
Each member of a population is assigned a unique number. The members of the
population chosen for the sample will be those whose numbers are identical to the
ones extracted from the random number table in succession until the desired
sample size is reached. An example of a random number table is given below.
To select a random sample using simple random sampling method we should
follow the steps given below:
i) Determine the population size (N).
ii) Determine the sample size (n).
iii) Number each member of the population under investigation in serial
order. Suppose there are 100 members; number them from 00 to 99.
iv) Determine the starting point of selecting sample by randomly picking up a page
from random number tables and dropping your finger on the page blindly.
v) Choose the direction in which you want to read the numbers (from left to right,
or right to left, or down or up).
vi) Select the first 'n' numbers whose last X digits fall within the numbered range
(0 to N-1), where X is the number of digits in the largest serial number. If N = 100
then X would be 2, if N = 1000 then X would be 3, and so on.
vii) Once a number is chosen, do not use it again.
viii) If you reach the end point of the table before obtaining 'n' numbers, pick
another starting point and read in a different direction and then use the first X digits
instead of the last X digits and continue until the desired sample is selected.
Example: Suppose you have a list of 80 students and want to select a sample of 20
students using simple random sampling method. First assign each student a number
from 00 to 79. To draw a sample of 20 students using random number table, you
need to find 20 two-digit numbers in the range 00 to 79. You can begin anywhere
and go in any direction. For example, start from the 6th row and 1st column of the
random number table given in this Unit. Read the last two digits of the numbers. If
the number is within the range (00 to 79) include the number in the sample.
Otherwise skip the number and read the next number in the chosen direction.
If a number is already selected, omit it. In the example, starting from the 6th row and
1st column and moving from left to right, the following numbers are
considered in order to select 20 numbers for the sample.
The bold faced digits in the ones and tens places indicate the selected
numbers for the sample. Therefore, the following are the 20 numbers chosen as
sample.
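The table-reading procedure described above can be imitated in code. The sketch
below is only an illustration: a pseudo-random generator stands in for the printed
random number table, and the function name and seed are hypothetical. It follows
the same rules as the example: read a two-digit number, skip it if it lies outside
00 to 79, and omit it if it has already been selected.

```python
import random


def draw_simple_random_sample(population_size, sample_size, seed=None):
    """Select sample_size distinct serial numbers from 0 .. population_size - 1."""
    if sample_size > population_size:
        raise ValueError("sample cannot be larger than the population")
    rng = random.Random(seed)
    # X digits are needed to write the largest serial number (2 for 00..79).
    digits = len(str(population_size - 1))
    chosen = []
    seen = set()
    while len(chosen) < sample_size:
        number = rng.randrange(10 ** digits)  # stands in for reading the table
        if number >= population_size:
            continue  # outside the numbered range: skip it
        if number in seen:
            continue  # already selected: omit it
        seen.add(number)
        chosen.append(number)
    return chosen


# 20 students out of a list of 80, numbered 00 to 79.
sample = draw_simple_random_sample(population_size=80, sample_size=20, seed=42)
print([f"{s:02d}" for s in sample])
```

The numbers printed depend on the seed, just as the numbers read from a table
depend on the starting point and direction.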

Advantages
i) The simple random sample requires less knowledge about the characteristics of
the population.
ii) Since the sample is selected at random, giving each member of the population an equal
chance of being selected, the sample can be called an unbiased sample. Bias due to
human preferences and influences is eliminated.
iii) Assessment of the accuracy of the results is possible by sample error
estimation.
iv) It is a simple and practical sampling method provided population size is not
large.
Limitations
i) If the population size is large, a great deal of time must be spent listing and
numbering the members of the population.
ii) A simple random sample will not adequately represent many population
characteristics unless the sample is very large. That is, if the researcher is
interested in choosing a sample on the basis of the distribution in the population of
gender, age, social status, a simple random sample needs to be very large to ensure
all these distributions are representative of the population. To obtain a
representative sample across multiple population attributes we should use stratified
random sampling.
2. Systematic Sampling: In systematic sampling the sample units are selected
from the population at equal intervals in terms of time, space or order. The
selection of a sample using systematic sampling method is very simple. From a
population of „N‟ units, a sample of „n‟ units may be selected by following the
steps given below:
i) Arrange all the units in the population in an order by giving serial numbers from
1 to N.
ii) Determine the sampling interval by dividing the population by the sample size.
That is, K=N/n.
iii) Select the first sample unit at random from the first sampling interval (1 to K).
iv) Select the subsequent sample units at equal regular intervals.
For example, we want to have a sample of 100 units from a population of 1000
units. First arrange the population units in some serial order by giving numbers
from 1 to 1000. The sample interval size is K=1000/100=10. Select the first sample
unit at random from the first 10 units (i.e. from 1 to 10). Suppose the first sample
unit selected is 5, then the subsequent sample units are 15, 25, 35, ..., 995. Thus,
in the systematic sampling the first sample unit is selected at random and this
sample unit in turn determines the subsequent sample units that are to be selected.
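The steps above can be sketched in a few lines of Python. The function name is
hypothetical, and the sketch assumes, as in the example, that N is an exact
multiple of n so that K is a whole number.

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return the serial numbers (1-based) chosen by systematic sampling."""
    k = population_size // sample_size  # sampling interval K = N / n
    if k < 1:
        raise ValueError("sample cannot be larger than the population")
    if start is None:
        # First unit chosen at random from the first interval (1 to K).
        start = random.Random(seed).randint(1, k)
    # Every K-th unit after the starting unit.
    return [start + i * k for i in range(sample_size)]


units = systematic_sample(population_size=1000, sample_size=100, start=5)
print(units[:4], units[-1])  # [5, 15, 25, 35] 995
```

Passing start=5 reproduces the worked example; leaving it out lets the function
pick the first unit at random, after which every other unit is fixed.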
Advantages
i) The main advantage of using a systematic sample is that it is more expeditious to
collect a sample systematically since the time taken and work involved is less than
in simple random sampling. For example, it is frequently used in exit polls and
in surveys of store customers.
ii) This method can be used even when no formal list of the population units is
available. For example, if we are interested in knowing the opinion of
consumers on improving the services offered by a store, we may simply choose
every kth (say 10th) consumer visiting the store, provided that we know how many
consumers are visiting the store daily (say 1000 consumers visit and we want to
have 100 consumers as sample size).
Limitations
i) If there is periodicity in the occurrence of elements of a population, the selection
of a sample using systematic sampling could give a highly unrepresentative
sample. For example, suppose the sales of a consumer store are arranged
chronologically and using systematic sampling we select the 1st of every
month. The 1st day of a month cannot be a representative sample for the whole
month. Thus in systematic sampling there is a danger of order bias.
ii) Not every combination of units has an equal chance of being selected,
because the selection of units for the sample depends on the initial unit selected.
Regardless of how we select the first unit of the sample, the subsequent units are
automatically determined, so the sample lacks complete randomness.
3. Stratified Random Sampling: The stratified sampling method is used when the
population is heterogeneous rather than homogeneous. A heterogeneous population
is composed of unlike elements such as male/female, rural/urban, literate/illiterate,
high income/low income groups, etc. In such cases, use of simple random sampling
may not always provide a representative sample of the
population. In stratified sampling, we divide the population into relatively
homogeneous groups called strata. Then we select a sample using simple random
sampling from each stratum. There are two approaches to decide the sample size
from each stratum, namely, proportional stratified sample and
disproportional stratified sample. With either approach, the stratified sampling
guarantees that every unit in the population has a chance of being selected. We will
now discuss these two approaches of selecting samples.
i) Proportional Stratified Sample: If the number of sampling units drawn from
each stratum is in proportion to the corresponding stratum population size, we say
the sample is proportional stratified sample. For example, let us say we want to
draw a stratified random sample from a heterogeneous population (on some
characteristics) consisting of rural/urban and male/female respondents.
So we have to create 4 homogeneous sub-groups called strata as follows:
Urban-males; Urban-females; Rural-males; Rural-females.

To ensure that each stratum in the sample represents the corresponding stratum in the
population, each stratum must appear in the sample in the same
proportion as it does in the population. Let us assume that we
know (or can estimate) the population distribution as follows:
65% male, 35% female and 30% urban and 70% rural. Now we can determine the
approximate proportions of our 4 strata in the population as shown below.

Thus a representative sample would be composed of 19.5% urban-males, 10.5%
urban-females, 45.5% rural-males and 24.5% rural-females. Each percentage
should be multiplied by the total sample size needed to arrive at the actual sample
size required from each stratum. Suppose we require a sample of 1000; then the
required sample in each stratum is as follows:
Urban-males 195; Urban-females 105; Rural-males 455; Rural-females 245.
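The arithmetic behind the proportional allocation can be checked with a short
sketch. The function name is hypothetical. Multiplying the residence share by the
sex share, as the percentages in the text do, quietly assumes that the male/female
split is the same in urban and rural areas.

```python
def proportional_allocation(residence_shares, sex_shares, total_sample):
    """Sample size per stratum, proportional to the stratum's population share."""
    allocation = {}
    for residence, residence_share in residence_shares.items():
        for sex, sex_share in sex_shares.items():
            # Stratum share = residence share x sex share (independence assumed).
            stratum_share = residence_share * sex_share
            allocation[f"{residence}-{sex}"] = round(total_sample * stratum_share)
    return allocation


sizes = proportional_allocation(
    residence_shares={"urban": 0.30, "rural": 0.70},
    sex_shares={"male": 0.65, "female": 0.35},
    total_sample=1000,
)
print(sizes)
# {'urban-male': 195, 'urban-female': 105, 'rural-male': 455, 'rural-female': 245}
```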

ii) Disproportional Stratified Sample: In a disproportional stratified sample,
the sample size for each stratum is not allocated in proportion to the
stratum population size, but by analytical considerations of the researcher such as stratum
variance, stratum population, time and financial constraints etc. For example, if the
researcher is interested in finding differences among different
strata, disproportional sampling should be used. Consider the example of
income distribution of households. There is a small percentage of households
within the high income brackets and a large percentage of households within the
low income brackets. The income among higher income group households has
higher variance than the income among the lower income group households.
To avoid under-representation of higher income groups in the sample, a
disproportional sample is taken. This indicates that as the variability within the
stratum increases, the sample size must increase to provide accurate estimates, and
vice versa.
Suppose that, in our example of urban/rural and male/female strata, the
estimated stratum variances (s²) are as follows (variance itself is discussed
in Unit 9 of this course).
Urban-male 3.0; Urban-female 5.5; Rural-males 2.5; Rural-females 1.75.

The above figures are normally estimated on the basis of the previous knowledge of the
researcher.
Then the allocation of a sample size of 1000 among the strata using the disproportional
stratified sampling method will be as shown in the following table:
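One way such an allocation can be computed is sketched below. The weighting rule
is an assumption made for illustration: each stratum is weighted by its population
share multiplied by its variance. Other rules exist (Neyman allocation, for
instance, weights by the standard deviation rather than the variance), so the
figures printed here should not be read as the unit's own table. The function name
is hypothetical.

```python
def variance_weighted_allocation(shares, variances, total_sample):
    """Allocate the sample in proportion to (population share x variance).

    This weighting rule is an illustrative assumption, not a rule stated
    in the text.
    """
    weights = {name: shares[name] * variances[name] for name in shares}
    total_weight = sum(weights.values())
    return {
        name: round(total_sample * weight / total_weight)
        for name, weight in weights.items()
    }


shares = {"urban-male": 0.195, "urban-female": 0.105,
          "rural-male": 0.455, "rural-female": 0.245}
variances = {"urban-male": 3.0, "urban-female": 5.5,
             "rural-male": 2.5, "rural-female": 1.75}

print(variance_weighted_allocation(shares, variances, total_sample=1000))
# {'urban-male': 214, 'urban-female': 212, 'rural-male': 417, 'rural-female': 157}
```

Under this rule the urban-female stratum, which has the highest variance, receives
about twice the 105 units it would get under proportional allocation. Rounding each
stratum separately happens to sum to 1000 here, but that is not guaranteed in
general.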

Advantages
a) Since the samples are drawn from each of the strata of the population,
stratified sampling is more representative and thus more accurately reflects the
characteristics of the population from which they are chosen.
b) It is more precise and to a great extent avoids bias.
c) Since sample size can be less in this method, it saves a lot of time, money and
other resources for data collection.
Limitations
a) Stratified sampling requires a detailed knowledge of the distribution of attributes
or characteristics of interest in the population to determine the homogeneous
groups that lie within it. If we cannot accurately identify the homogeneous groups,
it is better to use simple random sample since improper stratification can lead to
serious errors.
b) Preparing a stratified list is a difficult task as the lists may not be readily
available.
4. Cluster Sampling: In cluster sampling we divide the population into groups
called clusters, each having heterogeneous characteristics, and then select a sample of
clusters using simple random sampling. We assume that each of the clusters is
representative of the population as a whole. This sampling is widely used for
geographical studies of many issues. For example, if we are interested in finding
the attitudes of consumers residing in Delhi towards a new product of a company,
the whole city of Delhi can be divided into 20 blocks. If we assume that each of
these blocks represents the attitudes of consumers of Delhi as a whole, we
may use cluster sampling, treating each block as a cluster. We will then select a
sample of 2 or 3 clusters and obtain the information from all the consumers in
them. The principles that are basic to the cluster sampling are as follows:
i) The differences or variability within a cluster should be as large as possible. As
far as possible the variability within each cluster should be the same as that of the
population.
ii) The variability between clusters should be as small as possible. Once the
clusters are selected, all the units in the selected clusters are covered for obtaining
data.
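The Delhi example can be sketched as follows. The block and consumer labels are
made up for illustration (5 consumers per block, to keep the output short), and
the function name is hypothetical. The point is only the mechanics: clusters are
chosen at random, then every unit inside a chosen cluster is covered.

```python
import random


def cluster_sample(clusters, clusters_to_pick, seed=None):
    """Pick whole clusters at random and return every unit inside them."""
    rng = random.Random(seed)
    # Simple random sampling of cluster names, without replacement.
    picked = rng.sample(sorted(clusters), clusters_to_pick)
    # All units in the selected clusters are covered.
    units = [unit for name in picked for unit in clusters[name]]
    return picked, units


# Hypothetical data: 20 blocks, each with 5 consumers.
blocks = {
    f"block-{b:02d}": [f"b{b:02d}-consumer-{c}" for c in range(1, 6)]
    for b in range(1, 21)
}

picked, respondents = cluster_sample(blocks, clusters_to_pick=3, seed=7)
print(picked)
print(len(respondents))  # 15 respondents: 3 blocks x 5 consumers each
```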
Advantages
a) The cluster sampling provides significant gains in data collection costs, since
traveling costs are smaller.
b) Since the researcher need not cover all the clusters and only a sample of clusters
are covered, it becomes a more practical method which facilitates fieldwork.
Limitations
a) The cluster sampling method is less precise than sampling of units from the
whole population since the latter is expected to provide a better cross-section of the
population than the former, due to the usual tendency of units in a cluster to be
homogeneous.
b) The sampling efficiency of cluster sampling is likely to decrease with the
decrease in cluster size or increase in number of clusters.
The above advantages and
limitations of cluster sampling suggest that, in practical situations where sampling
efficiency is less important but the cost is of greater
significance, the cluster sampling method is extensively used. If the division of
clusters is based on the geographic sub-divisions, it is known as area sampling.
In cluster sampling, instead of covering all the units in each cluster, we can resort to
sub-sampling, which is known as two-stage sampling. Here, the clusters are termed primary units
and the units within the selected clusters are taken as secondary units.
5. Multistage Sampling: We have already covered two stage sampling. Multi
stage sampling is a generalisation of two stage sampling. As the name suggests,
multi stage sampling is carried out in different stages. In each stage progressively
smaller (population) geographic areas will be randomly selected.
A political pollster interested in assembly elections in Uttar Pradesh may first
divide the state into different assembly constituencies, and a sample of assembly
constituencies may be selected in the first stage. In the second stage, each of the
sampled assembly constituencies is divided into a number of segments and a
sample of assembly segments may be selected. In the third stage, within each
sampled assembly segment, either all the households or a random sample of
households would be interviewed. In this sampling method, it is possible to take as
many stages as are necessary to achieve a representative sample. Each stage results
in a reduction of sample size.
In multistage sampling, a suitable method of sampling is used at each stage.
As many stages are used as are needed to arrive at a sample of the desired sampling
units.
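The three stages in the polling example can be sketched as nested random
selections. All labels and counts below (10 constituencies, 5 segments each, 20
households per segment) are invented for illustration, the function name is
hypothetical, and simple random sampling is assumed at every stage even though a
different method could be used at each stage.

```python
import random


def multistage_sample(state, n_constituencies, n_segments, n_households, seed=None):
    """Three-stage sample: constituencies, then segments, then households."""
    rng = random.Random(seed)
    selected = []
    # Stage 1: a random sample of constituencies.
    for constituency in rng.sample(sorted(state), n_constituencies):
        segments = state[constituency]
        # Stage 2: a random sample of segments within each chosen constituency.
        for segment in rng.sample(sorted(segments), n_segments):
            households = segments[segment]
            # Stage 3: a random sample of households within each chosen segment.
            for household in rng.sample(households, n_households):
                selected.append((constituency, segment, household))
    return selected


# Hypothetical frame: 10 constituencies x 5 segments x 20 households.
state = {
    f"AC{c}": {
        f"AC{c}-S{s}": [f"AC{c}-S{s}-H{h}" for h in range(1, 21)]
        for s in range(1, 6)
    }
    for c in range(1, 11)
}

sample = multistage_sample(state, n_constituencies=3, n_segments=2,
                           n_households=4, seed=1)
print(len(sample))  # 24 households: 3 x 2 x 4
```

Only 24 of the 1000 households in the frame are interviewed, and they are
concentrated in 3 constituencies, which is where the saving in travel and
fieldwork comes from.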
Advantages
a) Multistage sampling provides cost gains by reducing data collection
costs.
b) Multistage sampling is more flexible and allows us to use different sampling
procedures in different stages of sampling.
c) If the population is spread over a very wide geographical area, multistage
sampling is the only sampling method available in a number of practical situations.
Limitations
a) If the sampling units selected at different stages are not representative, multistage
sampling becomes less precise and efficient.
