Sampling Techniques: Introduction
INTRODUCTION
Many professions (business, government, engineering, science, social research, agriculture, etc.) seek the broadest possible factual basis for decision-making. Without data on the subject, a decision is like a leap in the dark. Sampling is a procedure wherein a fraction of the data is taken from a large set of data, and the inference drawn from the sample is extended to the whole group. [Raj, p4] The initial task of the surveyor (a person or an establishment in charge of collecting and recording data) or researcher is to formulate a rational justification for the use of sampling in the research. If sampling is found appropriate, the researcher then:
(1) Identifies the target population as precisely as possible, and in a way that makes sense in terms of the purpose of the study. [Salant, p58]
(2) Puts together a list of the target population from which the sample will be selected. [Salant, p58] [Raj, p4] Many statisticians call this list a frame (more precisely, a list frame).
(3) Decides on a sampling technique and selects the sample. [Salant, p58]
(4) Makes an inference about the population. [Raj, p4]
These four steps are interwoven and cannot be considered in isolation from one another. Simple random sampling, systematic sampling, and stratified sampling fall into the category of simple sampling techniques. Complex sampling techniques are used only when the experimental data sets are large, when efficiency is required, or when precise estimates are needed for relatively small groups within large populations. [Salant, p59]
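The three simple techniques named above can be sketched in a few lines of Python. This is only an illustrative sketch: the list frame of 100 unit labels, the two strata, and the function names are assumptions made for the example and do not come from the cited texts.

```python
import random

def simple_random_sample(frame, n, rng):
    """Draw n units without replacement; every unit is equally likely."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Pick a random start, then take every k-th unit, with k = len(frame) // n."""
    k = len(frame) // n
    start = rng.randrange(k)
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, rng):
    """Draw a simple random sample of the same fraction from each stratum."""
    sample = []
    for units in strata.values():
        size = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, size))
    return sample

rng = random.Random(0)                 # fixed seed so the sketch is repeatable
frame = list(range(1, 101))            # hypothetical list frame of 100 unit labels
strata = {"small": frame[:60], "large": frame[60:]}  # hypothetical strata

srs = simple_random_sample(frame, 10, rng)
sys_sample = systematic_sample(frame, 10, rng)
strat = stratified_sample(strata, 0.1, rng)
```

With a 10 percent fraction, the stratified draw takes 6 units from the first stratum and 4 from the second, so each stratum is represented in proportion to its size.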
SAMPLING TERMINOLOGY
A population is a group of experimental data, persons, etc. A population is built up of elementary units, which cannot be further decomposed. A group of elementary units is called a cluster. The Population Total is the sum of the values of all the elements in the frame. The Population Mean is the average of all elements in the frame or population. The fraction of the population or data selected in a sample is called the Sampling Fraction. The reciprocal of the sampling fraction is called the Raising Factor. A sample in which every unit has the same probability of selection is called a random sample. If no repetitions are allowed, it is termed a simple random sample selected without replacement. If repetitions are permitted, the sample is selected with replacement.
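As a small worked illustration of these terms, the sketch below computes each quantity for a made-up population of ten values. The numbers and variable names are assumptions chosen for the example. Multiplying the sample total by the raising factor to estimate the population total is the usual expansion estimate, shown here only as one common use of the raising factor.

```python
import random

population = [12, 7, 9, 15, 4, 10, 8, 11, 6, 18]  # hypothetical values, N = 10
N = len(population)
n = 4                                              # chosen sample size

population_total = sum(population)       # 100
population_mean = population_total / N   # 10.0

sampling_fraction = n / N                # 0.4
raising_factor = N / n                   # 2.5, the reciprocal of 0.4

rng = random.Random(42)
without_replacement = rng.sample(population, n)   # no unit can repeat
with_replacement = rng.choices(population, k=n)   # the same unit may repeat

# Expansion estimate of the population total from the sample total.
estimated_total = raising_factor * sum(without_replacement)
```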
SAMPLING ERRORS
Sampling errors occur as a result of calculating an estimate (estimated mean, total, proportion, etc.) from a sample rather than from the entire population, because the figure obtained from the sample may not be exactly equal to the true population value. For example, [Raj, p4] if a sample of blocks is used to estimate the total number of persons in a city, and the blocks in the sample are larger than average, then this sample will overstate the true population of the city. When results from a sample survey are reported, they are often stated in the form "plus or minus" some amount, in the units being used. [Salant, p72] This plus or minus reflects sampling error. In [Salant, p73], Salant and Dillman explain that statistics based on samples drawn from the same population always vary from each other (and from the true population value) simply because of chance. This variation is sampling error, and the measure used to estimate it is the standard error. Standard errors are usually used to quantify the precision of estimates. Sampling distribution theory indicates that, for approximately normally distributed estimates, about 68 percent of the estimates lie within one standard error of the mean, about 95 percent lie within two standard errors, and almost all (about 99.7 percent) lie within three. [Cochran] [Sukhatme] [Raj +] [Raj, p16-17] Sampling errors can be reduced by proper selection of samples. In [Salant, p73], Salant and Dillman state that three factors affect sampling error with respect to the design of samples: the sampling procedure, the variation within the sample with respect to the variate of interest, and the size of the sample. [Yamane] adds that a larger sample results in a smaller sampling error.
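A short simulation can make the chance variation of sample estimates concrete. The sketch below is an illustration under stated assumptions: the population is a made-up set of 10,000 normally distributed values, and the standard error is approximated by the spread of 2,000 repeated sample means rather than by a formula from the cited texts.

```python
import random
import statistics

rng = random.Random(1)
# Hypothetical population: 10,000 values with mean near 50 and spread near 10.
population = [rng.gauss(50, 10) for _ in range(10_000)]
true_mean = statistics.fmean(population)

def sample_means(n, repeats):
    """Means of `repeats` simple random samples of size n (without replacement)."""
    return [statistics.fmean(rng.sample(population, n)) for _ in range(repeats)]

means_small = sample_means(25, 2000)
means_large = sample_means(100, 2000)

# Empirical standard error: the standard deviation of the sample means.
se_small = statistics.pstdev(means_small)
se_large = statistics.pstdev(means_large)

def share_within(means, k, se):
    """Fraction of sample means lying within k standard errors of the true mean."""
    return sum(abs(m - true_mean) <= k * se for m in means) / len(means)

for k in (1, 2, 3):
    print(k, round(share_within(means_small, k, se_small), 3))
print("SE with n=25:", round(se_small, 2), "SE with n=100:", round(se_large, 2))
```

The printed shares should come out close to the 68, 95, and 99.7 percent figures, and the standard error for samples of 100 should be roughly half that for samples of 25, in line with the remark that larger samples give smaller sampling error.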
NONSAMPLING ERRORS
The accuracy of an estimate is also affected by errors arising from causes such as incomplete coverage and faulty procedures of estimation; together with observational errors, these make up what are termed nonsampling errors. [Sukhatme, p381] The aim of a survey is always to obtain information on the true population value, getting as close to it as possible within the resources available for the survey. The discrepancy between the survey value and the corresponding true value is called the observational error or response error. [Hansen +] Response errors occur as a result of improper records on the variate of interest, careless reporting of the data, or deliberate modification of the data by the data collectors and recorders to suit their interests. [Raj, p96-97] [Sukhatme, p381] Nonresponse error [Cochran, p355-361] occurs when a significant number of people in a survey sample are absent or do not respond to the questionnaire, and those who do not respond differ from those who do in a way that is important to the study. [Salant, p20-21]
BIAS
Although judgment sampling is quicker than probability sampling, it is prone to systematic errors. For example, if 20 books are to be selected from a total of 200 to estimate the average number of pages in a book, a surveyor might suggest picking out those books which appear to be of average size. The difficulty with such a procedure is that, consciously or unconsciously, the sampler will tend to make errors of judgment in the same direction, selecting mostly books that are bigger than the average, or mostly books that are smaller. [Raj, p9] Such systematic errors lead to what are called biases. [Rosenthal]
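The book example can be mimicked with a small simulation. Everything numeric here is an assumption made for illustration: the 200 page counts are randomly generated, and the judgment sampler is modeled as someone whose idea of an "average" book is 40 pages too large and who picks the 20 books closest to that impression.

```python
import random
import statistics

rng = random.Random(7)
pages = [rng.randint(100, 600) for _ in range(200)]  # hypothetical page counts
true_mean = statistics.fmean(pages)

def judgment_sample(books, n, perceived_average):
    """Pick the n books closest to what the sampler believes is average."""
    return sorted(books, key=lambda p: abs(p - perceived_average))[:n]

# Assumption: the sampler's impression of "average" is 40 pages too high.
judgment_estimate = statistics.fmean(judgment_sample(pages, 20, true_mean + 40))
judgment_error = judgment_estimate - true_mean

# Probability sampling: individual samples miss in both directions,
# so the errors roughly cancel across many repeated samples.
random_estimates = [statistics.fmean(rng.sample(pages, 20)) for _ in range(2000)]
random_error = statistics.fmean(random_estimates) - true_mean

print("judgment sampling error:", round(judgment_error, 1))
print("average random sampling error:", round(random_error, 1))
```

The judgment sample errs in one direction every time it is repeated, which is the bias described above, while the random samples scatter around the true mean and their average error stays close to zero.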