3I's 4th Quarter Notes
Statistics is a branch of mathematics dealing with the collection, analysis, presentation,
and interpretation of data, and with drawing conclusions from them, while biostatistics is a
branch of statistics in which statistical techniques are applied to biomedical data to reach a
conclusion. Measurement scale (data type) is an important part of data collection, analysis,
and presentation.
In data collection, the type of questionnaire and the data recording tool differ according
to the data type. Similarly, in data analysis, statistical tests or methods differ from one
data type to another. Data presentation is an important step in communicating our
information and findings to the audience and readers in an effective way. If done properly,
it not only reduces word count but also conveys an important message in a meaningful way
so that the readers can grasp it easily. There are various tabulation and graphical methods
used to present data, and they cannot be applied without proper knowledge of data types.
Data are a collection of facts such as values or measurements. They can be numbers, words,
measurements, observations, or even just descriptions of things. Basically, data are of two
types: constant and variable. A constant is a situation or value that does not change, while a
characteristic, number, or quantity that increases or decreases over time or takes different
values in different situations is called a variable. Because a constant does not change, it is
not used for summary measures and analysis; only variables are.
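The difference between a constant and a variable can be checked on a small data set. The sketch below is only an illustration: the records, the field names ward and age, and the helper is_constant are all made up for this example and do not come from the notes. It shows that a constant has a single distinct value, so a summary measure of spread is only informative for a variable.

```python
from statistics import pstdev

# Hypothetical records (assumed for illustration): every participant is from
# the same ward, so "ward" is a constant, while "age" differs between people.
records = [
    {"ward": "A", "age": 34},
    {"ward": "A", "age": 51},
    {"ward": "A", "age": 47},
]


def is_constant(values):
    """Return True when every observation has the same value."""
    return len(set(values)) <= 1


wards = [r["ward"] for r in records]
ages = [r["age"] for r in records]

ward_is_constant = is_constant(wards)  # one distinct value only
age_is_constant = is_constant(ages)    # several distinct values
age_spread = pstdev(ages)              # spread is meaningful only for a variable

print("ward constant:", ward_is_constant)
print("age constant:", age_is_constant)
print("age standard deviation:", round(age_spread, 2))
```

In a real data set, a check like this can help decide which columns are worth summarizing before any analysis starts.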
A quantitative variable is one that shows some quantity through a numerical value.
Quantitative data are numeric (e.g., how many, how much, or how often). Age,
blood pressure, body temperature, hemoglobin level, and serum creatinine level are some
examples of quantitative data. Such data are also called metric data. They are of two types:
discrete and continuous. A discrete variable is quantitative, but its values cannot be
expressed or presented in the form of a decimal. For example, number of males, number of
females, number of patients, and family size are data that cannot be expressed with decimal
points. Continuous data are measured values that can be quantified and presented in decimals.
Age, height, weight, body mass index, serum creatinine, heart rate, systolic blood pressure,
and diastolic blood pressure are some examples.
Data collection is the process of gathering
and measuring information on variables of interest in an established systematic fashion that
enables one to answer stated research questions, test hypotheses, and evaluate outcomes.
The data collection component of research is common to all fields of study including physical
and social sciences, humanities, business, etc. While methods vary by discipline, the
emphasis on ensuring accurate and honest collection remains the same. The goal for all data
collection is to capture quality evidence that then translates to rich data analysis and allows
the building of a convincing and credible answer to questions that have been posed. Data
collection is one of the most important stages in conducting research. You can have the
best research design in the world, but if you cannot collect the required data, you will not be
able to complete your project. Data collection is a very demanding job that needs
thorough planning, hard work, patience, perseverance, and more to complete the
task successfully. Data collection starts with determining what kind of data are required,
followed by the selection of a sample from a certain population. After that, you need to use
a certain instrument to collect the data from the selected sample. Let us now take a closer
look at quantitative data.
Data collection is the process of gathering information on variables of interest from a
sample of research participants. There are two types of data collection:
1. Primary data collection refers to data that are collected directly from research participants
by the investigators of a study and used for that study.
Below are some of the sources of primary data:
a. Experiments require an artificial or natural setting in which to perform a logical study to
collect data. Experiments are more suitable for medicine, psychological studies, nutrition,
and other scientific studies. In experiments, the experimenter must keep control over the
influence of any extraneous variable on the results.
b. Survey is the most commonly used method in social sciences, management, marketing,
and, to some extent, psychology. Surveys can be conducted in different ways.
c. Questionnaire is the most commonly used method in surveys.
Questionnaires are lists of questions, either open-ended or close-ended, for which the
respondents give answers. A questionnaire can be administered via telephone or mail, in
person in a public area or in an institute, through electronic mail, through online platforms,
and by other methods.
d. Interview is a face-to-face conversation with the respondent. In an interview, the main
problem arises when the respondent deliberately hides information; otherwise, it is an
in-depth source of information. The interviewer can record the statements the
interviewee makes.
2. Secondary data collection refers to data that are collected by investigators from research
papers that are already published online. Secondary data are used by these investigators in a
secondary research study (e.g., a review of primary research).
The following are some examples of sources of secondary data:
• Books
• Records
• Biographies
• Newspapers
• Published censuses or other statistical data
• Data archives
• Internet articles
• Research articles by other researchers (journals)
• Databases, etc.
b. Use of different question types. To collect quantitative data, close-ended questions have
to be used in a survey. They can be a mix of multiple question types, including
multiple-choice questions such as semantic differential scale questions and rating scale
questions. This helps collect data that can be analyzed and made sense of.
c. Survey distribution and survey data collection. Above, we have seen the process of
building a survey along with the survey design to collect quantitative data. Distributing the
survey to collect data is the other important aspect of the survey process. There are
different ways of distributing a survey. Some of the most commonly used methods are:
➢ e-mail
➢ sample size
➢ embedding a survey
➢ social distribution
2. One-on-one Interviews. This quantitative data collection method was also traditionally
conducted face-to-face but has shifted to telephonic and online platforms. Interviews offer a
marketer the opportunity to gather extensive data from the participants. Quantitative
interviews are highly structured and play a key role in collecting information. There are
three major types of these interviews:
a. face-to-face interviews
b. online or telephonic interviews
c. computer-assisted personal interviews
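Answers to close-ended questions become quantitative data once each answer option is given a numeric code. The sketch below is a minimal illustration of that coding step for one rating scale question. The five-point scale, its labels, and the sample responses are assumptions made for this example, not part of the notes.

```python
from collections import Counter
from statistics import mean

# Hypothetical five-point rating scale for one close-ended question.
SCALE = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly agree": 5,
}


def encode(responses):
    """Map each answer label to its numeric code.

    A label that is not on the scale raises KeyError, which flags a
    recording error instead of silently guessing a value.
    """
    return [SCALE[label] for label in responses]


# Made-up answers from five respondents.
responses = ["Agree", "Neutral", "Agree", "Strongly agree", "Disagree"]

codes = encode(responses)    # numeric codes, one per respondent
counts = Counter(responses)  # how often each option was chosen
average = mean(codes)        # a simple summary of the coded answers

print(codes)
print(counts.most_common(1))
print(average)
```

Open-ended answers cannot be coded this directly, which is one reason close-ended questions are preferred when quantitative data are needed.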
Step 1: Define the aim of your research. Before you start the process of data collection, you
need to identify exactly what you want to achieve. You can start by writing a problem
statement: what is the practical or scientific issue that you want to address and why does it
matter?
Step 2: Develop operational definitions and procedures. What are we
measuring? How will it be measured? Who will measure it? Clarity on these questions
is of utmost importance. Often, we will employ sampling, in which case we need to define a
sampling plan.
Step 3: Choose more than one data collection technique. There is no “best” tool. Do not let
the tool drive your work; rather, choose the right tool to address the evaluation question.
Step 4: Begin to collect your data.
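The sampling mentioned in Step 2 can be sketched in a few lines. The example below assumes simple random sampling without replacement, which is only one of several possible sampling plans. The population of 200 participant IDs, the sample size of 20, and the function name simple_random_sample are hypothetical choices made for illustration.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct units from the population without replacement.

    Passing a seed makes the draw repeatable, so the sampling plan can be
    documented and checked by someone else.
    """
    rng = random.Random(seed)
    return rng.sample(population, n)


# Hypothetical sampling frame: 200 participant IDs, P001 to P200.
population = [f"P{i:03d}" for i in range(1, 201)]

# Select 20 participants; the seed value is arbitrary.
sample = simple_random_sample(population, 20, seed=42)

print(len(sample), "participants selected")
print(sorted(sample)[:5])
```

Every unit in the sampling frame has the same chance of being selected here. Other plans, such as stratified sampling, would need a different procedure.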