
CE5.3.2 Common Sources of Data Errors and Error-Checking Techniques


Common Sources of Data Errors and Error-Checking Techniques

Sources of Data Errors
The table below displays the evaluation stages at which you might find different sources of data
errors. You may use the error-checking techniques and helpful hints in the next section to avoid
or correct these data errors as they occur.

Sources of Errors by Evaluation Stage

Evaluation stage: Data collection
Common mistakes:
• Missing questions
• Unanswered questions
• Incongruent or extra responses to a single question
• Wrong box checked
• Response is not readable
• Writing error
• Response is out of expected range
Example: A survey was administered in both paper and electronic formats. However, the paper survey was an older version that included questions with seven possible response options, whereas the electronic survey had questions with five possible response items.

Evaluation stage: Data entry and cleaning
Common mistakes:
• Data incorrectly transferred from the instrument
• Values entered in the wrong field
• Inadvertent deletion or duplication during database handling
• Outliers and inconsistencies carried over from the instrument
• Values incorrectly entered
• Values incorrectly changed during previous data cleaning
Example: A survey was administered with a five-point rating scale, as well as a “not applicable” option. When these survey responses were entered into a database, the scaled responses were correctly coded as 1 through 5, but “not applicable” was coded as 6.

Evaluation stage: Data analysis
Common mistakes:
• Data incorrectly extracted from the database
• Data incorrectly extracted or coded
• Inadvertent deletion or duplication during analysis
• Outliers and inconsistencies carried over from the database
• Sorting errors (spreadsheets)
• Data cleaning errors
Example: A survey was analyzed using a statistical software program. Several duplicate responses were identified. In an attempt to remove the duplicates, the duplicate as well as the original responses were deleted.
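
The data analysis example above, in which the original responses were deleted along with their duplicates, can be guarded against with a simple check. The following is a minimal Python sketch, not a prescribed method: the field names (respondent_id, q1), the sample rows, and the helper drop_duplicates_keep_first are hypothetical and introduced only for illustration.

```python
def drop_duplicates_keep_first(rows, key):
    """Return rows with repeated keys removed, keeping the first occurrence."""
    seen = set()
    kept = []
    for row in rows:
        k = row[key]
        if k in seen:
            continue  # a later copy of a response that was already kept
        seen.add(k)
        kept.append(row)
    return kept


# Hypothetical survey responses; respondent 102 was recorded twice.
responses = [
    {"respondent_id": 101, "q1": 4},
    {"respondent_id": 102, "q1": 5},
    {"respondent_id": 102, "q1": 5},
    {"respondent_id": 103, "q1": 2},
]

cleaned = drop_duplicates_keep_first(responses, "respondent_id")

# Guard against the mistake in the example: every respondent present
# before de-duplication must still be present afterward.
ids_before = {r["respondent_id"] for r in responses}
ids_after = {r["respondent_id"] for r in cleaned}
assert ids_before == ids_after

print(len(responses), "rows before,", len(cleaned), "rows after")
# prints: 4 rows before, 3 rows after
```

Comparing the set of respondent IDs before and after the cleaning step is what catches an over-aggressive deletion; the row count alone would not.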

Regional Educational Laboratory Central
Colorado • Kansas • Missouri • Nebraska • North Dakota • South Dakota • Wyoming

Error-Checking Techniques
Descriptive analysis: Calculate descriptive statistics, including the mean and the range, and
check that they are sensible. Does the mean seem reasonable? Does the observed range of values
fall within the range of theoretically possible values?
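
As an illustration, the short Python sketch below applies a descriptive check to the data entry example from the table, where “not applicable” was coded as 6 on a five-point scale. The ratings list and the helper names describe and out_of_range are assumptions made for this example only.

```python
import statistics

VALID_MIN, VALID_MAX = 1, 5  # theoretically possible range on a five-point scale


def describe(values):
    """Return the mean, minimum, and maximum of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }


def out_of_range(values, low, high):
    """Return the values that fall outside the inclusive range [low, high]."""
    return [v for v in values if not low <= v <= high]


# Hypothetical ratings; the 6s are "not applicable" responses miscoded as numbers.
ratings = [4, 5, 3, 6, 2, 6, 4]

summary = describe(ratings)
bad = out_of_range(ratings, VALID_MIN, VALID_MAX)

print(summary)
print("Out-of-range values:", bad)  # prints: Out-of-range values: [6, 6]
```

Here the maximum of 6 exceeds the largest possible rating, and the miscoded values also pull the mean upward, so both the range and the mean signal a problem.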

Double entry: Arrange for two or more people to enter the same data and then check for
discrepancies.
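
One way to check for discrepancies is sketched below in Python. It assumes each entry pass is stored as a dictionary keyed by participant ID; that layout, the field names, and the function find_discrepancies are hypothetical.

```python
def find_discrepancies(entry_a, entry_b):
    """Compare two data-entry passes keyed by participant ID.

    Returns a list of (participant_id, field, value_a, value_b) tuples.
    A value of None means the record or field is missing from that pass.
    """
    discrepancies = []
    for pid in sorted(set(entry_a) | set(entry_b)):
        rec_a = entry_a.get(pid, {})
        rec_b = entry_b.get(pid, {})
        for field in sorted(set(rec_a) | set(rec_b)):
            a, b = rec_a.get(field), rec_b.get(field)
            if a != b:
                discrepancies.append((pid, field, a, b))
    return discrepancies


# Hypothetical data: the second person transposed the digits of one age.
first_pass = {1: {"age": 16, "q1": 4}, 2: {"age": 17, "q1": 3}}
second_pass = {1: {"age": 16, "q1": 4}, 2: {"age": 71, "q1": 3}}

for pid, field, a, b in find_discrepancies(first_pass, second_pass):
    print(f"Participant {pid}, {field}: {a!r} vs {b!r}")
# prints: Participant 2, age: 17 vs 71
```

Each flagged pair still needs to be resolved against the original instrument, because the comparison shows only that the two entries disagree, not which one is correct.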

Logic check: Carefully review the electronically entered data to make sure that the answers to
the different questions make sense. For example, if teachers indicated that they were not
evaluated during the school year in one survey question, it would be illogical for them to rate
their satisfaction with evaluator feedback in their response to another question.
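
The teacher evaluation example can be expressed as a rule and run over every record. The Python sketch below is one possible form; the field names (teacher_id, was_evaluated, feedback_satisfaction) and the use of None for a skipped question are assumptions for illustration.

```python
def illogical_responses(rows):
    """Return IDs of teachers who rated evaluator feedback
    even though they reported not being evaluated."""
    flagged = []
    for row in rows:
        not_evaluated = row["was_evaluated"] == "no"
        rated_feedback = row["feedback_satisfaction"] is not None
        if not_evaluated and rated_feedback:
            flagged.append(row["teacher_id"])
    return flagged


# Hypothetical survey records; None means the question was left blank.
rows = [
    {"teacher_id": "T1", "was_evaluated": "yes", "feedback_satisfaction": 4},
    {"teacher_id": "T2", "was_evaluated": "no", "feedback_satisfaction": None},
    {"teacher_id": "T3", "was_evaluated": "no", "feedback_satisfaction": 5},
]

print(illogical_responses(rows))  # prints: ['T3']
```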

Spot-checking: Randomly select several participants and check their raw data against the data
entered in a spreadsheet, document, or database. If you find any errors, randomly select another
group and check their data in a similar manner. Also examine the overall pattern of the data for
data-entry or coding mistakes. For example, suppose that while spot-checking the age variable for
several participants, you notice an age value of 100. You know that this is an error because your
sample consists of high school students.
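
The random selection and comparison steps can be sketched in Python as follows. The data layout (dictionaries keyed by participant ID), the sample size, and the function names are hypothetical; in practice the raw values would be read from the paper or electronic instrument.

```python
import random


def pick_spot_check_sample(participant_ids, k, seed=None):
    """Randomly choose k participant IDs to compare against the raw data."""
    rng = random.Random(seed)
    return sorted(rng.sample(list(participant_ids), k))


def compare_to_raw(sample_ids, raw, entered):
    """Return (id, raw_record, entered_record) for sampled IDs whose records differ."""
    mismatches = []
    for pid in sample_ids:
        if raw[pid] != entered.get(pid):
            mismatches.append((pid, raw[pid], entered.get(pid)))
    return mismatches


# Hypothetical high school sample; participant 4's age was entered as 100.
raw = {1: {"age": 15}, 2: {"age": 16}, 3: {"age": 17}, 4: {"age": 16}, 5: {"age": 18}}
entered = {1: {"age": 15}, 2: {"age": 16}, 3: {"age": 17}, 4: {"age": 100}, 5: {"age": 18}}

sample = pick_spot_check_sample(raw, k=3, seed=0)
mismatches = compare_to_raw(sample, raw, entered)
print("Checked participants:", sample)
print("Mismatches found:", mismatches)
```

Because the sample is random, a single round may miss the bad record. If any mismatch turns up, draw another sample (for example, with a different seed) and repeat the comparison.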

Helpful Tips
• Always keep a copy of original files. If an original file is modified, save it with a new
name in a different folder.
• Train data entry or data management staff.
• Develop instructions for data entry and data manipulation and establish data decision
rules.
• Keep a log of the data errors found and the changes made. The log should include
information about who found and corrected the error, and it should be easily accessible to
anyone working with the data.
• Always triple-check everything and screen data for errors frequently.
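
The error log described in the tips above could be kept as a shared CSV file. The Python sketch below shows one possible layout; the column names, the file name data_error_log.csv, and the sample entry are assumptions rather than a required format.

```python
import csv
import os
import tempfile
from datetime import date

# Hypothetical log columns covering who found and who corrected each error.
LOG_FIELDS = ["date", "found_by", "corrected_by", "record_id",
              "field", "old_value", "new_value", "reason"]


def append_log_entry(path, entry):
    """Append one correction to a CSV log, writing the header if the file is new."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=LOG_FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(entry)


def read_log(path):
    """Read the whole log back as a list of dictionaries."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


# A temporary folder is used here so the example cleans up after itself.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "data_error_log.csv")
    append_log_entry(log_path, {
        "date": date.today().isoformat(),
        "found_by": "J. Doe",
        "corrected_by": "A. Smith",
        "record_id": 4,
        "field": "age",
        "old_value": 100,
        "new_value": 16,
        "reason": "Entry did not match the paper survey",
    })
    logged = read_log(log_path)

print(logged[0]["field"], logged[0]["old_value"], "->", logged[0]["new_value"])
# prints: age 100 -> 16
```

Appending rows instead of rewriting the file preserves the history of every change, and a plain CSV can be opened by anyone working with the data.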
Note. Adapted from the following sources:

The Pell Institute. (n.d.). Enter, organize, & clean data.
http://toolkit.pellinstitute.org/evaluation-guide/analyze/enter-organize-clean-data/

United Nations High Commissioner for Refugees. (n.d.). Cleaning data. In Coordination toolkit.
http://www.coordinationtoolkit.org/wp-content/uploads/130813-Data-cleaning.pdf

Van den Broeck, J., Argeseanu Cunningham, S., Eeckels, R., & Herbst, K. (2005). Data cleaning: Detecting,
diagnosing, and editing data abnormalities. PLOS Medicine, 2(10), Article e267.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1198040/#!po=27.7778

This handout was prepared under Contract ED-IES-17-C-0005 by Regional Educational Laboratory Central, administered by
Marzano Research. The content does not necessarily reflect the views or policies of IES or the U.S. Department of Education, nor
does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.
