Types and Sources of Errors in Statistical Data
Types of Errors
In general, there are two types of errors:
a. non-sampling errors and
b. sampling errors.
It is important for a researcher to be aware of
these errors, in particular non-sampling errors, so
that they can be either minimised or eliminated
from the data collected.
Response
Response errors result when data have been
requested, provided, received or recorded
incorrectly.
They may occur as a result of inefficiencies with
the questionnaire, the interviewer, the respondent
or the survey process.
a.
b. Interviewer bias
c. Respondent errors
d.
Non-Response
Non-response results when data is not collected
from respondents.
The proportion of these non-respondents in the
sample is called the non-response rate.
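As a minimal sketch of this definition, the rate can be computed from the number of sampled units and the number that responded. The function name and the figures below are hypothetical and serve only as an illustration.

```python
def non_response_rate(sample_size: int, respondents: int) -> float:
    """Proportion of sampled units from which no data was collected."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= respondents <= sample_size:
        raise ValueError("respondents must be between 0 and sample_size")
    return (sample_size - respondents) / sample_size


# Hypothetical survey: 500 units sampled, 410 of them responded.
rate = non_response_rate(sample_size=500, respondents=410)
print(f"Non-response rate: {rate:.1%}")
```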
Non-response can be either total or partial.
Total non-response or unit non-response can
arise if a respondent cannot be contacted (because
the sampling frame is incomplete or out of date), is
not at home, is unable to respond because of
language difficulties or illness, refuses outright to
answer any questions, or if the dwelling unit is
vacant.
Other respondents may indicate that they simply
don't have the time to complete the interview or
survey form.
When conducting surveys it is important to
document information on why a respondent has
not responded.
Partial non-response or item non-response
can occur when a respondent replies to some but
not all questions of the survey.
This can arise due to memory problems,
inadequate information or an inability to answer a
particular question/section of the questionnaire.
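Item non-response can be measured separately for each question. The sketch below assumes that each returned questionnaire is stored as a dictionary in which None marks an unanswered question; the function name, question names and data are hypothetical.

```python
from typing import Optional


def item_non_response_rates(
    responses: list[dict[str, Optional[str]]],
    questions: list[str],
) -> dict[str, float]:
    """For each question, the share of returned questionnaires that left it blank."""
    if not responses:
        raise ValueError("responses must not be empty")
    rates = {}
    for question in questions:
        # A missing key and an explicit None both count as unanswered.
        missing = sum(1 for r in responses if r.get(question) is None)
        rates[question] = missing / len(responses)
    return rates


# Hypothetical returns from four respondents (None = question not answered).
responses = [
    {"age": "34", "income": None, "region": "North"},
    {"age": "51", "income": "42000", "region": "South"},
    {"age": None, "income": None, "region": "North"},
    {"age": "27", "income": "38000", "region": None},
]
rates = item_non_response_rates(responses, ["age", "income", "region"])
for question, rate in rates.items():
    print(f"{question}: {rate:.0%} unanswered")
```

A question with a much higher rate than the others may be one that respondents find sensitive or hard to answer.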
A respondent may refuse to answer if:
a. they find the questions particularly sensitive, or
b. they have been asked too many questions.
To reduce non-response, the following approaches
can be used:
a. taking care in questionnaire design by using
simple questions;
b. pilot testing the questionnaire;
c. explaining the purposes and uses of the survey;
d. assuring respondents that their responses are
confidential;
e. running public awareness activities, including
discussions with key organisations and interest
groups, news releases, media interviews and
articles.
Processing
Processing errors occur at various stages of data
processing, such as data cleaning, data capture and
editing.
Data cleaning involves carrying out preliminary
checks before the data are entered into the
processing system.
Coder bias is usually a result of poor training,
incomplete instructions, variability in coder
performance or data entry errors.
Inadequate checking and quality management at
this stage can cause data loss (where data is not
entered into the system) and data duplication
(where the same data is entered into the system
more than once), both of which introduce errors
into the data.
To minimise these errors, processing staff should
be given adequate training, instructions and
realistic workloads.
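A simple check for data loss and data duplication can be sketched as follows. It assumes that every collected questionnaire carries a unique identifier and that the identifiers of the entered records can be listed; the function name and identifiers are hypothetical.

```python
from collections import Counter


def check_data_entry(collected_ids, entered_ids):
    """Return (lost, duplicated) identifiers found when comparing the two lists."""
    counts = Counter(entered_ids)
    # Data loss: questionnaires that were collected but never entered.
    lost = sorted(set(collected_ids) - set(counts))
    # Data duplication: identifiers entered into the system more than once.
    duplicated = sorted(i for i, n in counts.items() if n > 1)
    return lost, duplicated


# Hypothetical batch: form 103 was never keyed in, forms 102 and 105 were keyed twice.
collected = [101, 102, 103, 104, 105]
entered = [101, 102, 102, 104, 105, 105]
lost, duplicated = check_data_entry(collected, entered)
print("Lost:", lost)
print("Duplicated:", duplicated)
```

Running a check of this kind after each data entry batch lets the missing forms be entered and the extra copies removed before analysis begins.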
Sampling error
Sampling error refers to the difference between the
estimate derived from a sample survey and the 'true'
value that would result if a census of the whole
population were taken under the same conditions.
These are errors that arise because data has been
collected from a part, rather than the whole of the
population.
Because of this, sampling errors are restricted to
sample surveys, unlike non-sampling errors, which
can occur in both sample surveys and censuses.
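The definition of sampling error can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is an artificial list of random integers standing in for some measured value, the estimate of interest is the mean, the sample is a simple random sample drawn without replacement, and the function name is hypothetical.

```python
import random
import statistics


def sampling_error(population, sample_size, rng):
    """Sample mean minus the 'true' population mean for one simple random sample."""
    sample = rng.sample(population, sample_size)
    return statistics.mean(sample) - statistics.mean(population)


rng = random.Random(42)  # fixed seed so that the run is repeatable
# Artificial population of 10,000 values (for example, annual incomes).
population = [rng.randint(20_000, 80_000) for _ in range(10_000)]

for n in (50, 500, 5_000, len(population)):
    error = sampling_error(population, n, rng)
    print(f"sample size {n:>6}: sampling error = {error:+.2f}")

# When the "sample" is the whole population (a census), the sampling error is zero.
```

The error for any single sample depends on which units happen to be drawn, but it tends to shrink as the sample covers more of the population. Non-sampling errors are not modelled here and would remain even in the census case.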
Sources
http://www.nss.gov.au/nss/home.nsf/SurveyDesignDoc/4354A8928428F834CA2571AB002479CE?OpenDocument
http://www.statcan.ca/english/edu/power/ch6/nonsampling/nonsampling.htm
http://www.statcan.ca/english/edu/power/ch6/sampling/sampling.htm