Data Collection and Methods
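While methods and aims may differ between fields, the overall process of data
collection remains largely the same. Before you begin collecting data, you need to
consider the aim of your research, the type of data you will collect, and the methods
and procedures you will use to collect, store, and process it.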
To collect high-quality data that is relevant to your purposes, follow these four steps.
Step 1: Define the aim of your research
Before you start the process of data collection, you need to identify exactly what you
want to achieve. You can start by writing a problem statement: what is the practical
or scientific issue that you want to address and why does it matter?
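Next, formulate one or more research questions that precisely define what you want
to find out. Depending on your research questions, you might need to
collect quantitative or qualitative data. Quantitative data is expressed in numbers
and analyzed statistically, while qualitative data is expressed in words and analyzed
through interpretation and categorization.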
Step 2: Choose your data collection method
Based on the data you want to collect, decide which method is best suited for your
research. Carefully consider which method will gather data that directly answers
your research questions.
Step 3: Plan your data collection procedures
When you know which method(s) you are using, you need to plan exactly how you
will implement them. What procedures will you follow to make accurate
observations or measurements of the variables you are interested in?
For instance, if you’re conducting surveys or interviews, decide what form the
questions will take; if you’re conducting an experiment, make decisions about
your experimental design (e.g., determine inclusion and exclusion criteria).
Operationalization
Sometimes your variables can be measured directly: for example, you can collect
data on the average age of employees simply by asking for dates of birth. However,
often you’ll be interested in collecting data on more abstract concepts or variables
that can’t be directly observed.
Operationalization means turning abstract conceptual ideas into measurable
observations. When planning how you will collect data, you need to translate the
conceptual definition of what you want to study into the operational definition of
what you will actually measure.
For example, to measure the leadership skills of managers, you might operationalize
the concept in two ways:
You ask managers to rate their own leadership skills on 5-point scales
assessing the ability to delegate, decisiveness, and dependability.
You ask their direct employees to provide anonymous feedback on the
managers regarding the same topics.
Using multiple ratings of a single concept can help you cross-check your data and
assess the test validity of your measures.
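As a minimal sketch of this cross-check, assume each manager has one self-rating and
several employee ratings on the same three 5-point items. The item names, the
function names, and all numbers below are hypothetical and only illustrate the idea
of averaging items into a composite score and comparing the two sources.

```python
from statistics import mean

# Hypothetical item names and ratings for one manager (illustrative values only).
ITEMS = ("delegation", "decisiveness", "dependability")

self_rating = {"delegation": 4, "decisiveness": 5, "dependability": 4}
employee_ratings = [
    {"delegation": 3, "decisiveness": 4, "dependability": 4},
    {"delegation": 2, "decisiveness": 4, "dependability": 5},
]


def composite_score(rating):
    """Average the item scores into one leadership score on the 1-5 scale."""
    return mean(rating[item] for item in ITEMS)


def rating_gap(self_rating, employee_ratings):
    """Self-rated composite minus the mean employee-rated composite."""
    employee_mean = mean(composite_score(r) for r in employee_ratings)
    return composite_score(self_rating) - employee_mean


gap = rating_gap(self_rating, employee_ratings)
print(f"Self minus employee composite: {gap:.2f}")
```

A gap far from zero would suggest that the two sources disagree, which is worth
investigating before relying on either measure alone.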
Sampling
You may need to develop a sampling plan to obtain data systematically. This
involves defining a population, the group you want to draw conclusions about, and
a sample, the group you will actually collect data from.
Your sampling method will determine how you recruit participants or obtain
measurements for your study. To decide on a sampling method you will need to
consider factors like the required sample size, accessibility of the sample, and
timeframe of the data collection.
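One possible sampling method is simple random sampling, in which every member of the
population has the same chance of being selected. The sketch below assumes you
already have a complete list of the population (a sampling frame); the employee IDs,
sample size, and function name are hypothetical.

```python
import random


def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size units without replacement, each equally likely."""
    if sample_size > len(population):
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(population, sample_size)


# Hypothetical sampling frame: employee IDs 1 to 500.
population = list(range(1, 501))
sample = simple_random_sample(population, sample_size=50, seed=42)
print(len(sample), "employees selected")
```

Recording the seed alongside the sampling plan lets someone else reproduce exactly
the same draw later.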
Standardizing procedures
Standardizing procedures means laying out specific step-by-step instructions so that
everyone in your research team collects data in a consistent way, for example by
conducting experiments under the same conditions and using objective criteria to
record and categorize observations. Consistent procedures help you avoid common
research biases like omitted variable bias or information bias. They also help
ensure the reliability of your data, and the written instructions can be used to
replicate the study in the future.
Before beginning data collection, you should also decide how you will organize and
store your data.
If you are collecting data from people, you will likely need to anonymize and
safeguard the data to prevent leaks of sensitive information (e.g. names or
identity numbers).
If you are collecting data via interviews or pencil-and-paper formats, you will
need to perform transcriptions or data entry in systematic ways to minimize
distortion.
You can prevent loss of data by having an organization system that is routinely
backed up.
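One way to safeguard identifying fields is to replace them with pseudonyms before the
data is stored. The sketch below uses a keyed hash from Python's standard library;
the field names, the key, and the function names are hypothetical. Note that this is
pseudonymization rather than full anonymization: anyone holding the key can
recompute the mapping, so the key should be stored separately from the data.

```python
import hashlib
import hmac


def pseudonymize(identifier, secret_key):
    """Replace an identifier with a keyed SHA-256 hash (a stable pseudonym)."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def anonymize_record(record, secret_key, id_field="name"):
    """Return a copy of the record with the identifying field replaced."""
    cleaned = dict(record)  # copy, so the original record is left untouched
    cleaned["participant_id"] = pseudonymize(cleaned.pop(id_field), secret_key)
    return cleaned


# Hypothetical record and key (illustrative values only).
SECRET_KEY = b"replace-with-a-private-key"
raw = {"name": "Jane Doe", "leadership_rating": 4}
print(anonymize_record(raw, SECRET_KEY))
```

Because the same name always maps to the same pseudonym, responses from one
participant can still be linked across files without exposing who they are.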
Step 4: Collect the data
Finally, you can implement your chosen methods to measure or observe the variables
you are interested in.
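For example, suppose you survey employees about their managers using both
closed-ended and open-ended questions.
The closed-ended questions ask participants to rate their manager’s leadership skills
on scales from 1–5. The data produced is numerical and can be statistically analyzed
for averages and patterns.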
The open-ended questions ask participants for examples of what the manager is
doing well now and what they can do better in the future. The data produced is
qualitative and can be categorized through content analysis for further insights.
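The two kinds of data call for different summaries. As a rough sketch, assume the
ratings are already entered as numbers and each open-ended answer has already been
assigned a category by a human coder; the values, category labels, and function
names below are made up for illustration.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical survey responses (illustrative values only).
closed_ended = [4, 5, 3, 4, 2, 5, 4]  # ratings on a 1-5 scale
open_ended_codes = [  # categories assigned to answers during manual coding
    "communication", "delegation", "communication",
    "feedback", "communication", "delegation",
]


def summarize_ratings(ratings):
    """Basic descriptive statistics for numerical ratings."""
    return {"mean": mean(ratings), "median": median(ratings)}


def count_categories(codes):
    """Frequency of each coded category, most common first."""
    return Counter(codes).most_common()


print(summarize_ratings(closed_ended))
print(count_categories(open_ended_codes))
```

The code only counts categories that were already assigned; deciding which category
an answer belongs to remains an interpretive step in content analysis.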
To ensure that high-quality data is recorded in a systematic way, here are some best
practices:
Record all relevant information as and when you obtain data. For example,
note down whether or how lab equipment is recalibrated during an
experimental study.
Double-check manual data entry for errors.
If you collect quantitative data, you can assess the reliability and validity to
get an indication of your data quality.
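One way to double-check manual data entry is double entry: two people (or one person
on two occasions) type in the same records independently, and any disagreements are
flagged for review against the source. This technique is an assumption chosen for
illustration, not the only option, and the participant IDs and values below are
hypothetical.

```python
def find_entry_mismatches(first_pass, second_pass):
    """Compare two independent entries of the same records, keyed by ID.

    Returns (record_id, first_value, second_value) for every disagreement,
    including records that appear in only one of the two passes.
    """
    mismatches = []
    for record_id in sorted(set(first_pass) | set(second_pass)):
        first_value = first_pass.get(record_id)
        second_value = second_pass.get(record_id)
        if first_value != second_value:
            mismatches.append((record_id, first_value, second_value))
    return mismatches


# Hypothetical ratings typed in twice from the same paper forms.
first_pass = {"P01": 4, "P02": 3, "P03": 5}
second_pass = {"P01": 4, "P02": 8, "P03": 5}

mismatches = find_entry_mismatches(first_pass, second_pass)
print(mismatches)  # [('P02', 3, 8)]
```

Each flagged record should be checked against the original form rather than resolved
by guessing which entry is correct.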
Collecting original data has significant advantages:
You can tailor data collection to your specific research aims (e.g.
understanding the needs of your consumers or user testing your
website).
You can control and standardize the process for high reliability and
validity (e.g. choosing appropriate measurements and sampling
methods).
However, there are also some drawbacks: data collection can be time-consuming,
labor-intensive, and expensive. In some cases, it’s more efficient
to use secondary data that has already been collected by someone else, but the
data might be less reliable.
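Reliability and validity are both about how well a method measures
something. Reliability refers to the consistency of a measure (whether the
results can be reproduced under the same conditions). Validity refers to the
accuracy of a measure (whether the results really represent what they are
supposed to measure).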
If you are doing experimental research, you also have to consider the internal
and external validity of your experiment.
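For multi-item rating scales, one common statistic for internal consistency (a form
of reliability) is Cronbach's alpha. It is used here as an assumption for
illustration; the text above does not prescribe a particular measure. For k items,
alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores). The
responses and function name below are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer the same number of items")

    items = list(zip(*responses))  # regroup the scores item by item
    item_variance_sum = sum(variance(scores) for scores in items)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical 5-point ratings: four respondents, three items each.
responses = [[4, 5, 4], [3, 3, 4], [5, 5, 5], [2, 3, 2]]
print(f"Cronbach's alpha: {cronbach_alpha(responses):.2f}")
```

Alpha only indicates whether the items vary together; it says nothing about whether
they measure the intended concept, which is a question of validity.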
What is operationalization?
Operationalization means turning abstract conceptual ideas into measurable
observations.
For example, the concept of social anxiety isn’t directly observable, but it can
be operationally defined in terms of self-rating scores, behavioral avoidance
of crowded places, or physical anxiety symptoms in social situations.
Before collecting data, it’s important to consider how you will operationalize
the variables that you want to measure.
In mixed methods research, you use both qualitative and quantitative data
collection and analysis methods to answer your research question.