STAT 20053 Lesson 2 Data Collection, Sampling Techniques and Sample Size Determination
7. Write special instructions for interviewers or respondents.
9. Always test your questions before taking the survey. (Pre-test)

An open-ended question is a type of question that does not include response categories. The respondent is not given any possible answers to choose from. This type of question is usually appropriate for collecting subjective data. It permits free responses that should be recorded in the respondent’s own words.

Question wording and question order have a large effect on the responses obtained.

Two surveys were taken in late 1993/early 1994 about Elvis Presley.

One survey asked: “In the past few years, there have been a lot of rumors and stories about whether Elvis Presley is really dead. How do you feel about this? Do you think there is any possibility that these rumors are true and that Elvis Presley is still alive, or don’t you think so?”
The second survey asked: “A recent television show examined various theories about Elvis Presley’s death. Do you think it is possible that Elvis is alive or not?”

8% of the respondents to the first question said it is possible that Elvis is still alive, and 16% of the respondents to the second question said it is possible that Elvis is still alive.

3. A focus group is a group interview of approximately six to twelve people who share similar characteristics or common interests. A facilitator guides the group based on a predetermined set of topics.

4. Experiment is a method of collecting data where there is direct human intervention on the conditions that may affect the values of the variable of interest.

Bear in mind that the experimental method has several limitations that you should be aware of.
- Unrealistic Controlled Environments
- Inability to Control for All Variables

5. Observation is a technique that involves systematically selecting, watching and recording behaviors of people or other phenomena and aspects of the setting in which they occur, for the purpose of gaining specified information. It includes all methods from simple visual observations to the use of high level machines and measurements, sophisticated equipment or facilities such as:
- Radiographic
- Biochemical
- X-ray machines
- Microscope
- Clinical examinations
- Microbiological examinations
The secondary data can be collected by the following five methods:

1. Published reports in newspapers and periodicals.
2. Financial data reported in annual reports.
3. Records maintained by the institution.
4. Internal reports of government departments.
5. Information from official publications.

Take Note!
• Always investigate the validity and reliability of the data by examining the collection method employed by your source.

The sample size is typically denoted by n and is always a positive integer. No exact sample size can be prescribed here, since it varies across research settings. However, all else being equal, a larger sample leads to increased precision in estimates of various properties of the population.

Take Note!
- Representativeness, not size, is the more important consideration.
- Use no fewer than 30 subjects if possible.
- If you use complex statistics, you may need a minimum of 100 or more in your sample (varies with method).
SAMPLE SIZE

determine the appropriate sample size:

1. Level of Precision
Also called sampling error, the level of precision is the range in which the true value of the population is estimated to be.

3. Degree of Variability

$$n \ge \left(\frac{Z\sigma}{e}\right)^2$$

where:
Z is the z-score corresponding to the level of confidence.
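As a rough check of the formula n ≥ (Zσ/e)², the sketch below computes the required sample size and rounds up to the next whole subject. The function name `sample_size_mean` and the small rounding tolerance are illustrative choices, not part of the lesson; the inputs Z = 1.96, σ = 0.5 and e = 0.03 are the ones used in the worked example in these notes.

```python
import math

def sample_size_mean(z, sigma, e):
    """Smallest integer n satisfying n >= (z * sigma / e) ** 2."""
    raw = (z * sigma / e) ** 2
    # Subtract a tiny tolerance so floating-point noise does not push an
    # exact whole number up to the next integer (an implementation choice).
    return math.ceil(raw - 1e-9)

n = sample_size_mean(z=1.96, sigma=0.5, e=0.03)
print(n)  # raw value is about 1067.11, so n is rounded up to 1068
```

Rounding up rather than to the nearest integer keeps the inequality n ≥ (Zσ/e)² satisfied.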
$$n \ge \left(\frac{1.96(0.5)}{0.03}\right)^2 = 1067.11$$

We need a sample of 1068 for our study.

• Estimating Proportion (Infinite Population)

confidence interval for p with specified margin of error e is given by

$$n \ge \left(\frac{Z}{e}\right)^2 p(1-p)$$

$$n \ge \left(\frac{2.58}{0.01}\right)^2 0.5(1-0.5) = 16{,}641$$

When p = 0.5, p(1 − p) reaches its maximum value of 0.25. This is called the most conservative estimate, since it gives the largest possible estimate of n.

The conservative formula, using the strong law of large numbers:

$$n \ge \frac{1}{4}\left(\frac{Z}{e}\right)^2 \approx 385$$

Where:
Confidence level is 95%.

$$n \ge \frac{N}{1 + Ne^2}$$

Where:
N is the population size.
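The proportion formula and the finite-population formula n ≥ N / (1 + Ne²) (often called Slovin's formula) can be checked numerically with a minimal sketch. The function names, the rounding tolerance, and the inputs N = 1000 and e = 0.05 in the last call are assumptions made for illustration; the other inputs are the Z and e values that appear in these notes.

```python
import math

def sample_size_proportion(z, e, p=0.5):
    """Smallest integer n satisfying n >= (z / e) ** 2 * p * (1 - p).

    The default p = 0.5 gives the most conservative (largest) n.
    """
    raw = (z / e) ** 2 * p * (1 - p)
    return math.ceil(raw - 1e-9)  # tolerance guards against float noise

def sample_size_finite(N, e):
    """Smallest integer n satisfying n >= N / (1 + N * e ** 2)."""
    raw = N / (1 + N * e ** 2)
    return math.ceil(raw - 1e-9)

print(sample_size_proportion(z=2.58, e=0.01))  # 16641
print(sample_size_proportion(z=1.96, e=0.05))  # 385
print(sample_size_finite(N=1000, e=0.05))      # 286 (hypothetical N and e)
```

Leaving p at 0.5 reproduces the conservative formula, because p(1 − p) = 1/4 at that point.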
Example:
The researcher needs to survey 286 BS Stat students.

• Finite Population Correction
If the population is small, then the sample size can be reduced slightly:

$$n \ge \frac{n_0}{1 + \frac{n_0 - 1}{N}}$$

- It is important that the individuals included in a sample represent a cross section of individuals in the population.
- If a sample is not representative, it is biased. You cannot generalize to the population from your statistical data.

Some definitions are needed to make the notion of a good sample more precise.
Definitions:

• Observation unit - An object on which a measurement is taken. This is the basic unit of observation, sometimes called an element. In studying human populations, observation units are often individuals.
• Target population - The complete collection of observations we want to study.
• Sampled population - The collection of all possible observation units that might have been chosen in a sample; the population from which the sample was taken.
• Sample - A subset of a population.
• Sampling unit - A unit that can be selected for a sample. We may want to study individuals, but do not have a list of all individuals in the target population. Instead, households serve as the sampling units, and the observation units are the individuals living in the households.
• Sampling frame - A list, map, or other specification of sampling units in the population from which a sample may be selected. For a survey using in-person interviews, the sampling frame might be a list of all street addresses.
• Sampling technique/Sampling Strategies - It is a plan you set forth to be sure that the sample you use in your research study represents the population from which you drew your sample.
• Sampling Bias - This involves problems in your sampling, which reveal that your sample is not representative of your population.

The following examples indicate some ways in which selection bias can occur:
- Deliberately or purposively selecting a “representative” sample.
- Misspecifying the target population.
- Failing to include all of the target population in the sampling frame, called undercoverage.
- Including population units in the sampling frame that are not in the target population, called overcoverage.
- Having multiplicity of listings in the sampling frame.
- Substituting a convenient member of a population for a designated member who is not readily available.
- Failing to obtain responses from all of the chosen sample. (Nonresponse)
- Allowing the sample to consist entirely of volunteers.

Advantages of Sampling Over Complete Enumeration
- Less Labor
- Reduced Cost
- Greater Speed
- Greater Scope
- Greater Efficiency and Accuracy
- Convenience
- Ethical Considerations

Two Types of Samples

1. Probability Sample
- Samples are obtained using some objective chance mechanism, thus involving randomization.
- They require the use of a complete listing of the elements of the universe, called the sampling frame.
- The probabilities of selection are known.

- Most basic method of drawing a probability sample.
- Assigns equal probabilities of selection to each possible sample.
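A method that gives every possible sample of size n the same chance of selection is usually called simple random sampling. A minimal sketch follows, assuming the sampling frame is available as a list; the frame of student IDs 1 to 500, the function name, and the seed are hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame without replacement."""
    rng = random.Random(seed)  # seed only makes the draw reproducible
    return rng.sample(list(frame), n)

frame = list(range(1, 501))          # hypothetical frame: student IDs 1..500
chosen = simple_random_sample(frame, n=50, seed=1)
print(len(chosen), len(set(chosen)))  # 50 50
```

Drawing without replacement means no unit can appear twice in the sample.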
Sampling Procedure

$$k = \frac{N}{n} = \frac{\text{Population Size}}{\text{Sample Size}}$$

Example:
We want to select a sample of 50 students from 500 students. Under this method, every kth item is picked from the sampling frame.

Solution:

$$k = \frac{500}{50} = 10$$

We start the sample from a random number i and take every kth unit after it. Suppose the random number i is 6; then we select units 6, 16, 26, 36, 46, ...

Advantage: Drawing of the sample is easy. It is easy to administer in the field, and the sample is spread evenly over the population.

Disadvantage: May give poor precision when unsuspected periodicity is present in the population.

• Stratified Random Sampling
- It is obtained by separating the population into non-overlapping groups called strata and then obtaining a simple random sample from each stratum.
- The individuals within each stratum should be homogeneous (or similar) in some way.

Example:
A sample of 50 students is to be drawn from a population consisting of 500 students belonging to two institutions A and B. The number of students in institution A is 200 and in institution B is 300. How will you draw the sample using proportional allocation?
Given:

$$n_1 = \left(\frac{n}{N}\right) N_1 = \left(\frac{50}{500}\right) 200 = 20$$

$$n_2 = \left(\frac{n}{N}\right) N_2 = \left(\frac{50}{500}\right) 300 = 30$$
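Both the every-kth-unit selection and the proportional allocation above can be reproduced with a short sketch. The function names are illustrative, the interval is computed with integer division (so N is assumed to be a multiple of n), and the rounding in the allocation is a simplification that may not add up to n exactly for other stratum sizes.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return the 1-based positions chosen by systematic sampling."""
    k = N // n                       # sampling interval k = N / n
    if start is None:
        start = random.Random(seed).randint(1, k)  # random start in 1..k
    return [start + j * k for j in range(n)]

def proportional_allocation(strata_sizes, n):
    """Allocate n across strata in proportion to size: n_h = (n / N) * N_h."""
    N = sum(strata_sizes.values())
    return {name: round(n * size / N) for name, size in strata_sizes.items()}

print(systematic_sample(N=500, n=50, start=6)[:5])        # [6, 16, 26, 36, 46]
print(proportional_allocation({"A": 200, "B": 300}, 50))  # {'A': 20, 'B': 30}
```

With k = 10 and a start of 6, the last selected position is 496, so all 50 units fall inside the frame of 500.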
Example:
Disadvantage: In actual field applications, adjacent households tend to have more similar characteristics than households that are far apart.

1. Organize the sampling process into stages where the unit of analysis is systematically grouped.
Example:
Use probability sampling if the main objective of the sample survey is making inferences about the characteristics of the population under study.

• Purposive Sampling - It is based on certain criteria laid down by the researcher. People who satisfy the criteria are interviewed. It is used to determine the target population of those who will be taken for the study.
• Judgement Sampling - Selects a sample in accordance with an expert’s judgment.

Cases wherein Non-Probability Sampling is Useful
- Only a few are willing to be interviewed
- Extreme difficulties in locating or identifying subjects

ACTIVITIES/ASSESSMENTS:

I. Determine if the source would be a primary or a secondary source.

______________1. Government Records
______________2. Dictionary
______________3. Artifact
REFERENCES: