The Importance of Ensuring Accurate and Appropriate Data Collection
Data collection is the process of gathering and measuring information on variables of interest, in an established systematic fashion
that enables one to answer stated research questions, test hypotheses, and evaluate outcomes. The data collection
component of research is common to all fields of study including physical and social sciences, humanities, business,
etc. While methods vary by discipline, the emphasis on ensuring accurate and honest collection remains the same.
Regardless of the field of study or preference for defining data (quantitative, qualitative), accurate data collection is
essential to maintaining the integrity of research. Both the selection of appropriate data collection instruments
(existing, modified, or newly developed) and clearly delineated instructions for their correct use reduce the likelihood
of errors occurring.
While the degree of impact from faulty data collection may vary by discipline and the nature of investigation, there is
the potential to cause disproportionate harm when these research results are used to support public policy
recommendations.
The primary rationale for preserving data integrity is to support the detection of errors in the data collection process,
whether they are made intentionally (deliberate falsifications) or not (systematic or random errors).
Most, Craddick, Crawford, Redican, Rhodes, Rukenbrod, and Laws (2003) describe ‘quality assurance’ and ‘quality
control’ as two approaches that can preserve data integrity and ensure the scientific validity of study results. Each
approach is implemented at different points in the research timeline (Whitney, Lind, Wahl, 1998):
1. Quality assurance - activities that take place before data collection begins
2. Quality control - activities that take place during and after data collection
Quality Assurance
Since quality assurance precedes data collection, its main focus is 'prevention' (i.e., forestalling problems with data
collection). Prevention is the most cost-effective activity to ensure the integrity of data collection. This proactive
measure is best demonstrated by the standardization of protocol developed in a comprehensive and detailed
procedures manual for data collection. Poorly written manuals increase the risk of failing to identify problems and
errors early in the research endeavor. These failures may be demonstrated in a number of ways:
- Uncertainty about the timing, methods, and identity of person(s) responsible for reviewing data
- Partial listing of items to be collected
- Vague description of data collection instruments to be used in lieu of rigorous step-by-step instructions on
  administering tests
- Failure to identify specific content and strategies for training or retraining staff members responsible for data
  collection
- Obscure instructions for using, making adjustments to, and calibrating data collection equipment (if
  appropriate)
- No identified mechanism to document changes in procedures that may evolve over the course of the
  investigation
An important component of quality assurance is developing a rigorous and detailed recruitment and training plan.
Implicit in training is the need to effectively communicate the value of accurate data collection to trainees (Knatterud,
Rockhold, George, Barton, Davis, Fairweather, Honohan, Mowery, O'Neill, 1998). The training aspect is particularly
important to address the potential problem of staff who may unintentionally deviate from the original protocol. This
phenomenon, known as ‘drift’, should be corrected with additional training, a provision that should be specified in the
procedures manual.
Given the range of qualitative research strategies (non-participant/participant observation, interview, archival, field
study, ethnography, content analysis, oral history, biography, unobtrusive research) it is difficult to make generalized
statements about how one should establish a research protocol in order to facilitate quality assurance. Certainly,
researchers conducting non-participant/participant observation may have only the broadest research questions to
guide the initial research efforts. Since the researcher is the main measurement device in such a study, there
are often few or no other data collection instruments. Indeed, instruments may need to be developed on the spot to
accommodate unanticipated findings.
Quality Control
While quality control activities (detection/monitoring and action) occur during and after data collection, the details
should be carefully documented in the procedures manual. A clearly defined communication structure is a necessary
pre-condition for establishing monitoring systems. There should not be any uncertainty about the flow of information
between principal investigators and staff members following the detection of errors in data collection. A poorly
developed communication structure encourages lax monitoring and limits opportunities for detecting errors.
Detection or monitoring can take the form of direct staff observation during site visits, conference calls, or regular and
frequent reviews of data reports to identify inconsistencies, extreme values or invalid codes. While site visits may not
be appropriate for all disciplines, failure to regularly audit records, whether quantitative or qualitative, will make it
difficult for investigators to verify that data collection is proceeding according to procedures established in the manual.
In addition, if the structure of communication is not clearly delineated in the procedures manual, transmission of any
change in procedures to staff members can be compromised.
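As an illustration of the routine review of data reports described above, the following minimal Python sketch flags invalid codes and extreme values in a batch of records. The field names (`id`, `sex`, `age`), the codebook in `VALID_SEX_CODES`, and the cutoff of 3.5 are hypothetical choices made for this example, and the median-based outlier rule is one common option rather than a method prescribed by the sources cited here.

```python
from statistics import median

# Hypothetical codebook for one categorical item (assumed for this example).
VALID_SEX_CODES = {1, 2, 9}  # 9 = not reported


def flag_records(records, valid_codes=VALID_SEX_CODES, threshold=3.5):
    """Return (record_id, problem) pairs for a batch of collected records."""
    flags = []

    # Invalid codes: any value that is not listed in the codebook.
    for rec in records:
        if rec["sex"] not in valid_codes:
            flags.append((rec["id"], "invalid code"))

    # Extreme values: modified z-score based on the median absolute deviation,
    # which is less distorted by the outliers themselves than a mean-based rule.
    ages = [rec["age"] for rec in records]
    med = median(ages)
    mad = median([abs(a - med) for a in ages])
    for rec in records:
        if mad > 0 and 0.6745 * abs(rec["age"] - med) / mad > threshold:
            flags.append((rec["id"], "extreme value"))

    return flags


if __name__ == "__main__":
    batch = [
        {"id": 1, "sex": 1, "age": 21},
        {"id": 2, "sex": 2, "age": 23},
        {"id": 3, "sex": 7, "age": 22},   # code 7 is not in the codebook
        {"id": 4, "sex": 1, "age": 24},
        {"id": 5, "sex": 2, "age": 220},  # likely a keying error
    ]
    for record_id, problem in flag_records(batch):
        print(record_id, problem)
```

A flagged record is only a prompt for follow-up: whether it is corrected, re-collected, or retained should follow the actions laid out in the procedures manual.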
Quality control also identifies the required responses, or ‘actions’, necessary to correct faulty data collection practices
and also minimize future occurrences. These actions are less likely to occur if data collection procedures are vaguely
written and the necessary steps to minimize recurrence are not implemented through feedback and education
(Knatterud et al., 1998).
In the social/behavioral sciences where primary data collection involves human subjects, researchers are taught to
incorporate one or more secondary measures that can be used to verify the quality of information being collected
from the human subject. For example, a researcher conducting a survey might be interested in gaining a better
insight into the occurrence of risky behaviors among young adults as well as the social conditions that increase the
likelihood and frequency of these risky behaviors.
To verify data quality, respondents might be queried about the same information but asked at different points of the
survey and in a number of different ways. Measures of ‘Social Desirability’ might also be used to get a measure of
the honesty of responses. Two points need to be raised here: 1) cross-checks within the data collection
process and 2) data quality being as much an observation-level issue as it is a complete data set issue. Thus, data
quality should be addressed for each individual measurement, for each individual observation, and for the entire data
set.
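The cross-check described above, in which the same information is requested at two points in a survey, could be sketched in Python as follows. The item names `q5` and `q27`, the respondent identifiers, and the `tolerance` parameter are hypothetical and serve only to illustrate the idea. The function reports quality at both levels mentioned: which individual observations disagree, and what share of the whole data set agrees.

```python
def check_repeated_items(responses, item_a, item_b, tolerance=0):
    """Compare two survey items that ask for the same information.

    Returns (inconsistent_ids, agreement_rate), where agreement_rate is the
    share of comparable respondents whose two answers match within tolerance,
    or None if no respondent answered both items.
    """
    inconsistent = []
    compared = 0
    for resp in responses:
        a, b = resp.get(item_a), resp.get(item_b)
        if a is None or b is None:
            continue  # an unanswered item cannot be cross-checked
        compared += 1
        if abs(a - b) > tolerance:
            inconsistent.append(resp["id"])
    rate = (compared - len(inconsistent)) / compared if compared else None
    return inconsistent, rate


if __name__ == "__main__":
    # Hypothetical responses: q5 and q27 ask the same question in different words.
    survey = [
        {"id": "r1", "q5": 2, "q27": 2},
        {"id": "r2", "q5": 0, "q27": 4},     # answers disagree
        {"id": "r3", "q5": 3, "q27": None},  # second item left blank
        {"id": "r4", "q5": 1, "q27": 1},
    ]
    flagged, rate = check_repeated_items(survey, "q5", "q27")
    print(flagged, rate)
```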
Each field of study has its preferred set of data collection instruments. The hallmark of laboratory sciences is the
meticulous documentation of the lab notebook while social sciences such as sociology and cultural anthropology may
prefer the use of detailed field notes. Regardless of the discipline, comprehensive documentation of the collection
process before, during and after the activity is essential to preserving data integrity.
References:
Knatterud, G.L., Rockhold, F.W., George, S.L., Barton, F.B., Davis, C.E., Fairweather, W.R., Honohan, T., Mowery,
R., O’Neill, R. (1998). Guidelines for quality assurance in multicenter trials: a position paper. Controlled Clinical Trials,
19:477-493.
Most, M.M., Craddick, S., Crawford, S., Redican, S., Rhodes, D., Rukenbrod, F., Laws, R. (2003). Dietary quality
assurance processes of the DASH-Sodium controlled diet study. Journal of the American Dietetic Association,
103(10): 1339-1346.
Whitney, C.W., Lind, B.K., Wahl, P.W. (1998). Quality assurance and quality control in longitudinal
studies. Epidemiologic Reviews, 20(1): 71-80.