Chapter 2 Sampling and Data Collection

The document summarizes methods for collecting data including observation, experimentation, and surveys. It discusses why sampling is used and common sampling methods like simple random sampling, stratified random sampling, cluster sampling, and systematic sampling. It also covers sampling error and non-sampling error which can occur from selection bias, measurement bias, and nonresponse bias.

CHAPTER 2
DATA COLLECTION AND SAMPLING
Points to highlight
- Methods of collecting data
  o Observation
  o Experiment
  o Survey
- Why sampling
- Sampling methods
  o Simple Random Sampling
  o Stratified Random Sampling
  o Cluster Sampling
  o Systematic Sampling
- Sampling and Non-sampling error

- Primary (data collection): observation, survey, experimentation
- Secondary (data compilation): print or electronic sources
I. Methods of collecting data
1. Observation
- The investigator observes characteristics of a subset of members of one or more existing populations.
- Goal: draw conclusions about the corresponding population, or about the difference between two or more populations.
- Advantages vs. disadvantages
  o Advantages: easy to conduct, relatively inexpensive
  o Disadvantages: provides little useful information; cause-and-effect conclusions cannot be drawn because of confounding variables

Observation
Example
A researcher for a pharmaceutical company wants to determine whether aspirin reduces the incidence of heart attacks. He selects a sample of men and women and asks each whether he or she has taken aspirin regularly over the past 2 years. Each person is also asked whether he or she has suffered a heart attack over the same period. The proportions reporting heart attacks would then be compared to judge whether aspirin is effective in reducing the likelihood of heart attacks.
2. Experiment
- The investigator observes how a response variable behaves when one or more explanatory variables (factors) are manipulated.
- Goal: determine the effect of the manipulated factors on the response variable.
- Advantages vs. disadvantages
  o Advantages: provides useful data, particularly about cause-and-effect relationships
  o Disadvantages: relatively expensive and time-consuming

Experiment
Example
A researcher for a pharmaceutical company wants to determine whether aspirin reduces the incidence of heart attacks. He selects a sample of men and women. The sample is divided into two groups: one group takes aspirin regularly and the other does not. After 2 years, the researcher determines the proportion of people in each group who have suffered a heart attack. It is then possible to draw a conclusion about whether aspirin is effective in reducing the likelihood of heart attacks.
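The example does not say how the two groups are formed. A common choice, assumed here, is random assignment. The Python sketch below illustrates that assumption only; the participant IDs and the names `assign_groups` and `proportion` are hypothetical.

```python
import random


def assign_groups(participants, rng):
    """Randomly split participants into two groups of (nearly) equal size."""
    shuffled = participants[:]      # copy so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def proportion(events):
    """Share of True values, e.g. the share of a group with a heart attack."""
    return sum(events) / len(events)


# Hypothetical sample of 200 participants identified by ID only.
participants = [f"P{i:03d}" for i in range(1, 201)]
rng = random.Random(2024)  # fixed seed so the split is reproducible
aspirin_group, control_group = assign_groups(participants, rng)

print(len(aspirin_group), len(control_group))
```

After the 2-year follow-up, `proportion` would be applied to each group's recorded outcomes and the two results compared.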
3. Survey
- One of the most familiar methods of collecting data
- Goal: solicit information from people about things such as income, family size, and opinions on various issues
- The majority of surveys are conducted for private use
- Examples:
  o Market researchers conduct a survey to determine the preferences and attitudes of consumers, which helps them target a new product.
  o A company surveys customers' satisfaction with its products and services.

SURVEY
- Personal interview
  o High rate of response, fewer incorrect answers
  o Costly: people, money, time…
- Telephone interview
  o Less expensive
  o Less personal, lower response rate
- Mail survey
  o Inexpensive
  o Low response rate, high number of incorrect answers
Survey Design Steps
- Define the issue
  o What are the purpose and objectives of the survey?
- Identify the questions to answer
- Decide what to measure and how to measure it
  o Decide what information is needed to answer the questions
  o Think about how you intend to tabulate and analyze the responses
- Define the population of interest
Survey Design Steps
- Design the questionnaire
  o Keep the questionnaire as short as possible
  o Questions should be short, simple, clear, and unambiguous
  o Begin with simple demographic questions
  o Use both closed-ended questions (such as dichotomous questions) and open-ended questions
  o Avoid leading questions


Survey Design Steps
- Pre-test the survey
  o Pilot test with a small group of participants
  o Assess clarity and length
- Determine the sample size and sampling method
- Select the sample and administer the survey

Types of Questions
- Closed-ended questions
  o Respondents select from a short list of defined choices
  Example: Major: __business __liberal arts __science __other
- Open-ended questions
  o Respondents are free to respond with any value, words, or statement
  Example: What did you like best about this course?
- Demographic questions
  o Questions about the respondents' personal characteristics
  Example: Gender: __Female __Male
II. SAMPLING METHODS
1/ Why Sampling
- Less time-consuming than a census
- Less costly to administer than a census
- Statistical results of sufficiently high precision can be obtained from samples
- Sometimes it is impossible to identify the whole population

POPULATION VS SAMPLE
- Population: all likely voters in the next election. Sample: 1000 voters selected at random for interview.
- Population: all parts produced today. Sample: a few parts selected for destructive testing.
- Population: all sales receipts of a year. Sample: every 100th receipt selected for audit.

2/ Methods of Sampling
Probability samples:
- Simple random
- Stratified random
- Systematic
- Cluster

Simple Random Samples
- Every individual or item in the population has an equal chance of being selected
- Selection may be with replacement or without replacement
- Samples can be obtained from a table of random numbers or from a computer random number generator
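As an illustration of the two selection modes, here is a minimal Python sketch using the standard library's pseudo-random number generator. The population of ID numbers 1 to 100 and the sample size of 10 are made up for the example.

```python
import random

# Hypothetical population: ID numbers 1..100 stand in for individuals or items.
population = list(range(1, 101))

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Without replacement: no individual can be selected twice.
sample_without = rng.sample(population, k=10)

# With replacement: the same individual may be selected more than once.
sample_with = rng.choices(population, k=10)

print("Without replacement:", sample_without)
print("With replacement:   ", sample_with)
```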

Stratified Random Samples
- The population is divided into subgroups (called strata) according to some common characteristic
- A simple random sample is selected from each subgroup
- The samples from the subgroups are combined into one
[Figure: a population divided into 4 strata, with a sample drawn from each]
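The three steps above can be sketched in Python as follows. The slide does not say how many items to take from each stratum, so this sketch assumes an equal number per stratum; the function name `stratified_sample` and the student population are hypothetical.

```python
import random


def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Draw a simple random sample of n_per_stratum items from each stratum."""
    # Step 1: divide the population into strata.
    strata = {}
    for item in population:
        strata.setdefault(stratum_of(item), []).append(item)
    # Steps 2 and 3: sample within each stratum, then combine the samples.
    combined = []
    for name in sorted(strata):
        combined.extend(rng.sample(strata[name], n_per_stratum))
    return combined


# Hypothetical population: 40 students, 10 in each of 4 class years (the strata).
years = ("Y1", "Y2", "Y3", "Y4")
students = [(f"{year}-{i:02d}", year) for year in years for i in range(10)]

sample = stratified_sample(students, lambda s: s[1], 3, random.Random(1))
print(sample)
```

Equal allocation is only one option; sampling each stratum in proportion to its size is another common choice.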
Systematic Samples
- Decide on the sample size: n
- Divide the frame of N individuals into groups of k individuals: k = N/n
- Randomly select one individual from the 1st group
- Select every kth individual thereafter
[Figure: N = 64, n = 8, k = 8, with the first group highlighted]
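A minimal Python sketch of this procedure, using the figure's values N = 64 and n = 8. The function name `systematic_sample` and the frame of ID numbers are assumptions made for illustration.

```python
import random


def systematic_sample(frame, n, rng):
    """Select n items: a random start in the first group, then every kth item."""
    k = len(frame) // n          # group size, k = N / n
    start = rng.randrange(k)     # random 0-based position within the first group
    return [frame[start + i * k] for i in range(n)]


frame = list(range(1, 65))       # hypothetical frame of N = 64 ID numbers
sample = systematic_sample(frame, 8, random.Random(7))
print(sample)
```

Each selected ID is exactly k = 8 positions after the previous one, so the sample spreads evenly across the frame.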
Cluster Samples
- The population is divided into several "clusters," each representative of the population
- A simple random sample of clusters is selected
- All items in the selected clusters can be used, or items can be chosen from a cluster using another probability sampling technique
[Figure: a population divided into 16 clusters, with randomly selected clusters forming the sample]
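A minimal Python sketch of the first variant, in which every item in the selected clusters is used. The 16 clusters of 5 items each, the choice of 4 clusters, and the function name `cluster_sample` are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, rng):
    """Select whole clusters at random and return all of their items."""
    chosen = rng.sample(sorted(clusters), n_clusters)  # SRS of cluster names
    items = [item for name in chosen for item in clusters[name]]
    return chosen, items


# Hypothetical population divided into 16 clusters of 5 items each.
clusters = {
    f"C{c:02d}": [f"C{c:02d}-item{i}" for i in range(5)]
    for c in range(1, 17)
}

chosen, items = cluster_sample(clusters, 4, random.Random(3))
print("Selected clusters:", chosen)
print("Sample size:", len(items))
```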
CONVENIENCE SAMPLING
What is it?
- An easily available (convenient) group is used to form the sample
- Examples: voluntary response sampling, self-selected sampling…

III. SAMPLING AND NON-SAMPLING ERROR
1/ Sampling Error
- An error that is expected to occur when a statement about the population is based on the observations contained in a sample taken from that population.
- The sampling error is the difference between the true (unknown) value of a population parameter (mean, standard deviation…) and its estimate, the sample statistic.
- The sampling error may be large if an unrepresentative sample happens to be selected.
- The only way to reduce the expected sampling error is to take a larger sample.
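The effect of sample size can be illustrated with a small simulation. The sketch below draws repeated simple random samples from a made-up population and compares the average absolute error of the sample mean for two sample sizes. The population values, the sample sizes, and the function name `mean_abs_sampling_error` are all assumptions.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population of 10,000 values; its mean is the "true" parameter.
population = [rng.gauss(50, 10) for _ in range(10_000)]
mu = statistics.fmean(population)


def mean_abs_sampling_error(n, trials=500):
    """Average |sample mean - population mean| over repeated samples of size n."""
    errors = []
    for _ in range(trials):
        sample = rng.sample(population, n)
        errors.append(abs(statistics.fmean(sample) - mu))
    return statistics.fmean(errors)


err_small = mean_abs_sampling_error(25)
err_large = mean_abs_sampling_error(400)
print(f"n = 25:  average sampling error {err_small:.2f}")
print(f"n = 400: average sampling error {err_large:.2f}")
```

The larger samples give a noticeably smaller average error, although any single sample can still miss the true value.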
2/ Non-Sampling Error
An error that occurs when there are mistakes in the acquisition of the data, or when the sample observations are selected improperly. There are three types:
- Selection bias
- Measurement or response bias
- Nonresponse bias

SELECTION BIAS
- Occurs when the way the sample is selected systematically excludes some part of the population of interest.
- Example: a study addresses an issue concerning all residents of a city, but the method of selecting individuals excludes the homeless or those without telephones.
- Selection bias also usually occurs when only volunteers or self-selected individuals are used in a study.
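The telephone example can be illustrated with a small simulation. Everything in the sketch below is made up: the city of 10,000 residents, the 20% share without a telephone, and the assumption that those residents have lower incomes. It only shows how a frame that excludes part of the population can shift an estimate.

```python
import random
import statistics

rng = random.Random(5)  # fixed seed so the sketch is reproducible

# Hypothetical city of 10,000 residents. About 20% have no telephone, and in
# this made-up population they tend to have lower incomes (arbitrary units).
residents = []
for _ in range(10_000):
    has_phone = rng.random() < 0.8
    income = rng.gauss(50, 10) if has_phone else rng.gauss(30, 10)
    residents.append((has_phone, income))

true_mean = statistics.fmean([income for _, income in residents])

# Sampling frame built from a telephone directory: residents without a
# telephone can never be selected, however large the sample.
phone_frame = [income for has_phone, income in residents if has_phone]
phone_sample_mean = statistics.fmean(rng.sample(phone_frame, 1000))

print(f"True mean income:      {true_mean:.1f}")
print(f"Telephone-sample mean: {phone_sample_mean:.1f}")
```

In this made-up setting the telephone sample overstates the mean income because the excluded group differs from the rest of the population.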

MEASUREMENT OR RESPONSE BIAS
- Occurs when the method of observation tends to produce values that systematically differ from the true value in some way.
- This problem might happen when:
  o An improperly calibrated scale is used to weigh items
  o Questions on a survey are worded in a way that tends to influence the response
  o Responses are affected by the appearance or behavior of the interviewer, by the group or organization conducting the survey, or by the tendency of people not to be completely honest when asked about sensitive issues (sexual or illegal activities…)
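The miscalibrated-scale case can be sketched in Python. The item weights, the 2-unit offset, and the function name `measured_minus_true` are hypothetical. The point of the sketch is that the systematic error does not disappear as the sample grows, even when every item is weighed.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical true weights of 5,000 items (arbitrary units).
true_weights = [rng.uniform(90, 110) for _ in range(5000)]
true_mean = statistics.fmean(true_weights)

OFFSET = 2.0  # assumed calibration error: the scale reads 2 units too heavy


def measured_minus_true(n):
    """Mean of n miscalibrated measurements minus the true population mean."""
    sample = rng.sample(true_weights, n)
    measured = [w + OFFSET for w in sample]
    return statistics.fmean(measured) - true_mean


print(f"n = 50:   error {measured_minus_true(50):.2f}")
print(f"n = 5000: error {measured_minus_true(5000):.2f}")
```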
NONRESPONSE BIAS
- Occurs when responses are not obtained from some individuals in the sample.
- As with selection bias, nonresponse bias can distort the results of the study.
- This problem might happen when:
  o An interviewer is unable to contact a person listed in the sample
  o A sampled person refuses to respond for some reason

Case study
In summer 1936, the Literary Digest magazine wanted to predict the next US president, just as it had successfully done five times before.
It sent out postcards to 10 million Americans and then announced that Alfred M. Landon, then governor of Kansas, would gain 57% of the popular vote and thus demolish Franklin D. Roosevelt, the incumbent president.
In fact, Roosevelt won by a landslide never before seen in U.S. history. He garnered not the predicted 43%, but 62.5% of the popular vote and all but 8 of the 531 electoral votes.
The Digest did not survive the debacle and folded shortly thereafter.
What had gone wrong?
Case analysis
What had the Digest done with its poll?
1 Sample selection
10 million people were chosen from various sources: mailing lists of subscribers, club membership rosters, telephone directories, and automobile registration rolls.
2 Response percentage
Only 2.4 million of the 10 million questionnaires were mailed back.
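The figures quoted in the case study can be turned into a response rate and a prediction error with a few lines of Python. The variable names are chosen for illustration; the numbers are those given above.

```python
# Figures quoted in the case study.
questionnaires_sent = 10_000_000
questionnaires_returned = 2_400_000

response_rate = questionnaires_returned / questionnaires_sent

predicted_roosevelt = 43.0   # percent of the popular vote, Digest prediction
actual_roosevelt = 62.5      # percent of the popular vote, as reported above

prediction_error = actual_roosevelt - predicted_roosevelt

print(f"Response rate: {response_rate:.0%}")                       # 24%
print(f"Prediction error: {prediction_error:.1f} percentage points")  # 19.5
```

About three out of four questionnaires were never returned, and the poll underestimated Roosevelt's share by 19.5 percentage points.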
End of chapter 2

THANK YOU!
