
Republic of the Philippines

BATANGAS STATE UNIVERSITY JPLPC-Malvar


Malvar, Batangas
Tel. Nos.: (043) 778-2170/ (043) 406-0830 loc. 124
Website Address: http://www.batstate-u.edu.ph

Course Code: PSY 315
Course Description: Field Methods in Psychology
Week: 3

MODULE 3: DEFINING AND MEASURING VARIABLES

Module Introduction
In this module, we examine Step 3 of the research process: how researchers define and
measure variables. We begin by considering different types of variables, from simple, concrete
variables to more abstract ones. We then focus on the process of measurement, with particular
attention to abstract variables that must be defined and measured using operational definitions.
Two criteria used to evaluate the quality of a measurement procedure, validity and reliability,
are discussed, followed by the scales of measurement, the modalities of measurement, and other
aspects of measurement.

Intended Learning Objectives


At the end of the lesson, the student should be able to:
1. Define construct and operational definition, and explain the role that constructs play in
theories and the limitations of operational definitions.
2. Explain the validity and reliability of measurement and explain why and how it is
measured.
3. Compare and contrast the four scales of measurement (nominal, ordinal, interval, and
ratio) and identify examples of each.
4. Identify the three modalities of measurement and explain the strengths and
weaknesses of each.

Module Content
3.1 CONSTRUCTS AND OPERATIONAL DEFINITIONS

Theories and Constructs


In attempting to explain and predict behavior, scientists and philosophers often develop
theories that contain hypothetical mechanisms and intangible elements. Although these
mechanisms and elements cannot be seen and are only assumed to exist, we accept them as real
because they seem to describe and explain behaviors that we see. For example, a bright child
does poor work in school because she has low “motivation.” A kindergarten teacher may hesitate
to criticize a lazy child because it may injure the student’s “self-esteem.” But what is motivation,
and how do we know that it is low? What about self-esteem? How do we recognize poor
self-esteem or healthy self-esteem when we cannot see it in the first place? Many research variables,
particularly variables of interest to behavioral scientists, are in fact hypothetical entities created
from theory and speculation. Such variables are called constructs or hypothetical constructs.

Although constructs are hypothetical and intangible, they play very important roles in
behavioral theories. In many theories, constructs can be influenced by external stimuli and, in
turn, can influence external behaviors.

Operational Definitions
An operational definition is a procedure for indirectly measuring and defining a variable
that cannot be observed or measured directly. An operational definition specifies a measurement
procedure (a set of operations) for measuring an external, observable behavior and uses the
resulting measurements as a definition and a measurement of the hypothetical construct.
Researchers often refer to the process of using an operational definition as operationalizing a
construct.
In addition to serving as a basis for measuring variables, operational definitions can
also be used to define variables that are manipulated. For example, the construct “hunger” can be
operationally defined as the number of hours of food deprivation.

Limitations of Operational Definitions


The primary limitation of an operational definition is that there is not a one-to-one
relationship between the variable that is being measured and the actual measurements produced
by the operational definition. Consider, for example, the familiar situation of an instructor
evaluating the students in a class. In this situation, the underlying variable is knowledge or
mastery of subject matter, and the instructor’s goal is to obtain a measure of knowledge for each
student. However, knowledge is a construct that cannot be directly observed or measured.
Therefore, instructors typically give students a task (such as an exam, an essay, or a set of
problems), and then measure how well students perform the task. The resulting scores are then
treated as measurements of knowledge, even though performance on any single task is also
influenced by factors other than knowledge, such as anxiety, fatigue, or luck; the operational
definition captures the construct only indirectly and imperfectly.

3.2 VALIDITY AND RELIABILITY OF MEASUREMENT

Consistency of a Relationship
To show the amount of consistency between two different measurements, the two scores
obtained for each person can be presented in a graph called a scatter plot. In a scatter plot, the
two scores for each person are represented as a single point, with the horizontal position of the
point determined by one score and the vertical position determined by the second score.
Figure (a) shows an example of a consistent positive relationship between two
measurements. The relationship is described as positive because the two measurements change
together in the same direction. Therefore, people who score high on the first measurement
(toward the right of the graph) also tend to score high on the second measurement (toward the
top of the graph). Similarly, people scoring low on one measurement also score low on the other.
Figure (b) shows an example of a consistent negative relationship. This time the two
measurements change in opposite directions, so that people who score high on one measurement
tend to score low on the other. For example, we could measure performance on a math test by
counting the number of correct answers or by counting the number of errors. These two
measurements should be negatively related.
Figure (c) shows two measurements that are not consistently related. In this graph, some
people who score high on one measurement also score high on the second, but others who score
high on the first measurement score low on the second. In this case, there is no consistent,
predictable relationship between the two measurements.
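The three patterns described above can be summarized numerically with a correlation coefficient. The short Python sketch below is an illustrative aside, not part of the module, and all scores in it are invented; it computes a Pearson correlation for a hypothetical positive and a hypothetical negative relationship:

```python
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two lists of scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical scores for five people, each measured twice.
first = [2, 4, 5, 7, 9]
positive = [1, 3, 6, 8, 10]   # second score changes in the same direction
negative = [10, 8, 6, 3, 1]   # second score changes in the opposite direction

print(round(pearson_r(first, positive), 2))  # close to +1
print(round(pearson_r(first, negative), 2))  # close to -1
```

Values near +1 or -1 indicate a highly consistent relationship, as in Figures (a) and (b); values near 0 indicate no consistent relationship, as in Figure (c).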

Validity of Measurement
The validity of a measurement procedure is the degree to which the measurement process
measures the variable that it claims to measure.

Researchers have developed several methods for assessing the validity of measurement.
Six of the more commonly used definitions of validity are as follows.

1. Face validity is an unscientific form of validity demonstrated when a measurement
procedure superficially appears to measure what it claims to measure.
2. Concurrent validity is demonstrated when scores obtained from a new measure are
directly related to scores obtained from an established measure of the same variable.
3. Predictive validity is demonstrated when scores obtained from a measure accurately
predict behavior according to a theory.
4. Construct validity requires that the scores obtained from a measurement procedure
behave exactly the same as the variable itself. Construct validity is based on many
research studies that use the same measurement procedure and grows gradually as each
new study contributes more evidence.
5. Convergent validity is demonstrated by a strong relationship between the scores
obtained from two (or more) different methods of measuring the same construct.
6. Divergent validity is demonstrated by showing little or no relationship between the
measurements of two different constructs.

Reliability of Measurement
The reliability of a measurement procedure is the stability or consistency of the
measurement. If the same individuals are measured under the same conditions, a reliable
measurement procedure produces identical (or nearly identical) measurements.

Types and Measures of Reliability


• Successive measurements: The reliability estimate obtained by comparing the scores
obtained from two successive measurements is commonly called test-retest reliability.
A researcher may use exactly the same measurement procedure for the same group of
individuals at two different times. Or a researcher may use modified versions of the
measurement instrument (such as alternative versions of an IQ test) to obtain two
different measurements for the same group of participants. When different versions of the
instrument are used for the test and the retest, the reliability measure is often called
parallel-forms reliability. Typically, reliability is determined by computing a correlation
to measure the consistency of the relationship between the two sets of scores.
• Simultaneous measurements: When measurements are obtained by direct observation
of behaviors, it is common to use two or more separate observers who simultaneously
record measurements. For example, two psychologists may watch a group of preschool
children and observe social behaviors. Each individual records (measures) what he or she
observes, and the degree of agreement between the two observers is called inter-rater
reliability. Inter-rater reliability can be measured by computing the correlation between
the scores from the two observers or by computing a percentage of agreement between
the two observers.
• Internal consistency: Often, a complex construct such as intelligence or personality is
measured using a test or questionnaire consisting of multiple items. The idea is that no
single item or question is sufficient to provide a complete measure of the construct. A
common example is the use of exams that consist of multiple items (questions or
problems) to measure performance in an academic course. The final measurement for
each individual is then determined by adding or averaging the responses across the full
set of items. A basic assumption in this process is that each item (or group of items)
measures a part of the total construct. If this is true, then there should be some
consistency between the scores for different items or different groups of items. To
measure the degree of consistency, researchers commonly split the set of items in half
and compute a separate score for each half. The degree of agreement between the two
scores is then evaluated, usually with a correlation. This general process results in a
measure of split-half reliability.
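As a rough illustration of the three approaches above, the following Python sketch uses invented scores (none of the data comes from the module) to compute a test-retest correlation, an inter-rater percentage of agreement, and a split-half correlation:

```python
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two lists of scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Test-retest reliability: the same five people measured at two times.
test = [12, 15, 11, 18, 14]
retest = [13, 14, 11, 17, 15]
test_retest_r = pearson_r(test, retest)

# Inter-rater reliability as a percentage of agreement: two observers
# independently code the same eight observation intervals.
rater_a = ["hit", "miss", "hit", "hit", "miss", "hit", "hit", "miss"]
rater_b = ["hit", "miss", "hit", "miss", "miss", "hit", "hit", "miss"]
agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

# Split-half reliability: a ten-item test (1 = correct, 0 = incorrect) is
# split into odd and even items, each half is scored per person, and the
# two sets of half-scores are correlated.
items = [
    [1, 1, 0, 1, 1, 0, 1, 1, 1, 0],
    [1, 0, 0, 1, 0, 0, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0, 0, 1, 0, 0],
    [1, 1, 1, 0, 1, 1, 1, 0, 1, 1],
]
odd_half = [sum(row[0::2]) for row in items]
even_half = [sum(row[1::2]) for row in items]
split_half_r = pearson_r(odd_half, even_half)

print(round(test_retest_r, 2), round(agreement, 3), round(split_half_r, 2))
```

In practice, inter-rater agreement is often corrected for chance agreement with Cohen's kappa, and split-half estimates are commonly adjusted with the Spearman-Brown formula; both are standard refinements of the simple calculations sketched here.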

The Relationship between Reliability and Validity
A measure cannot be valid unless it is reliable, but a measure can be reliable without
being valid.
Although reliability and validity are both criteria for evaluating the quality of a
measurement procedure, these two factors are partially related and partially independent. They
are related to each other in that reliability is a prerequisite for validity; that is, a measurement
procedure cannot be valid unless it is reliable. If we measure your IQ twice and obtain
measurements of 75 and 160, not only are the measurements unreliable but we also have no idea
what your IQ actually is. The huge discrepancy between the two measurements is impossible if
we are truly measuring intelligence. Therefore, we must conclude that there is so much error in
the measurements that the numbers themselves have no meaning.
On the other hand, it is not necessary for a measurement to be valid for it to be reliable.
For example, we could measure your height and claim that it is a measure of intelligence.
Although this is a foolish and invalid method for defining and measuring intelligence, it would
be very reliable, producing consistent scores from one measurement to the next. Thus, the
consistency of measurement is no guarantee of validity.

3.3 SCALES OF MEASUREMENT

The Nominal Scale


The categories that make up a nominal scale simply represent qualitative (not
quantitative) differences in the variable measured. The categories have different names but are
not related to each other in any systematic way. For example, if you were measuring academic
majors for a group of college students, the categories would be art, chemistry, English, history,
psychology, and so on. Each student would be placed in a category according to his or her major.
Measurements from a nominal scale allow us to determine whether two individuals are the same
or different, but they do not permit any quantitative comparison.

The Ordinal Scale


The categories that make up an ordinal scale have different names and are organized in
an ordered series. Often, an ordinal scale consists of a series of ranks (first, second, third, and so
on) like the order of finish in a horse race. Occasionally, the categories are identified by verbal
labels such as small, medium, and large drink sizes at a fast-food restaurant. In either case, the
fact that the categories form an ordered sequence means that there is a directional relationship
between the categories. With measurements from an ordinal scale, we can determine whether
two individuals are different, and we can determine the direction of difference. However, ordinal
measurements do not allow us to determine the magnitude of the difference between the two
individuals. For example, a large coffee is bigger than a small coffee but we do not know how
much bigger. Other examples of ordinal scales are socioeconomic class (upper, middle, and
lower) and T-shirt sizes (small, medium, and large).

Interval and Ratio Scales


The characteristic that differentiates interval and ratio scales is the zero point. The
distinguishing characteristic of an interval scale is that it has an arbitrary zero point. That is, the
value 0 is assigned to a particular location on the scale simply as a matter of convenience or
reference. Specifically, a value of 0 does not indicate the total absence of the variable being
measured. For example, a temperature of 0 degrees Fahrenheit does not mean that there is no
temperature, and it does not prohibit the temperature from going even lower. Interval scales with
an arbitrary zero point are fairly rare.
A ratio scale, on the other hand, is characterized by a zero point that is not an arbitrary
location. Instead, the value 0 on a ratio scale is a meaningful point representing none (a complete
absence) of the variable being measured. The existence of an absolute, nonarbitrary zero point
means that we can measure the absolute amount of the variable; that is, we can measure the
distance from 0. This makes it possible to compare measurements in terms of ratios. For
example, a glass with 8 ounces of water (8 more than 0) has twice as much as a glass with 4
ounces (4 more than 0). With a ratio scale, we can measure the direction and magnitude of the
difference between measurements and describe differences in terms of ratios. Ratio scales are

quite common and include physical measures, such as height and weight, as well as variables,
such as reaction time or number of errors on a test.
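The difference between an arbitrary and an absolute zero point can be checked with temperature. The sketch below is an illustrative aside (not from the module): it shows that a ratio computed on the Fahrenheit scale, whose zero point is arbitrary, is not preserved when the same two temperatures are expressed on the Kelvin scale, whose zero is a true absence of thermal energy:

```python
def f_to_kelvin(f):
    """Convert Fahrenheit (interval scale) to Kelvin (ratio scale)."""
    return (f - 32) * 5 / 9 + 273.15

# On the Fahrenheit scale, 80 / 40 = 2, which seems to say "twice as hot,"
# but because 0 degrees F is an arbitrary point, the ratio is meaningless.
print(80 / 40)  # 2.0

# On the Kelvin scale the same two temperatures differ by only about 8%.
print(round(f_to_kelvin(80) / f_to_kelvin(40), 3))
```

Differences (80 minus 40) remain meaningful on both scales; it is only the ratio comparison that requires a nonarbitrary zero point, which is exactly what distinguishes ratio from interval measurement.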

Selecting a Scale of Measurement


One obvious factor that differentiates the four types of measurement scales is their ability
to compare different measurements. A nominal scale can tell us only that a difference exists. An
ordinal scale tells us the direction of the difference (which is more and which is less). With an
interval scale, we can determine the direction and the magnitude of a difference. Measurements
from a ratio scale allow us to determine the direction, the magnitude, and the ratio of the
difference. The ability to compare measurements has a direct effect on the ability to describe
relationships between variables. For example, when a research study involves measurements
from nominal scales, the results of the study can establish the existence of only a qualitative
relationship between variables. With nominal scales, we can determine whether a change in one
variable is accompanied by a change in the other variable, but we cannot determine the direction
of the change (an increase or a decrease) or the magnitude of the change. An
interval or a ratio scale, on the other hand, allows a much more sophisticated description of a
relationship. For example, we could determine that a 1-point increase in one variable (such as
drug dose) results in a 4-point decrease in another variable (such as heart rate).

3.4 MODALITIES OF MEASUREMENT

Self-Report Measures
The primary advantage of a self-report measure is that it is probably the most direct way
to assess a construct. Each individual is in a unique position of self-knowledge and
self-awareness; presumably, no one knows more about the individual’s fear than the individual. Also,
a direct question and its answer have more face validity than measuring some other response that
theoretically is influenced by fear. On the negative side, however, it is very easy for participants
to distort self-report measures. A participant may deliberately lie to create a better self-image, or
a response may be influenced subtly by the presence of a researcher, the wording of the
questions, or other aspects of the research situation. When a participant distorts self-report
responses, the validity of the measurement is undermined.

Physiological Measures
Physiological measures involve brain-imaging techniques such as positron emission
tomography (PET) scanning and magnetic resonance imaging (MRI). These techniques allow
researchers to monitor activity levels in specific areas of the brain during different kinds of
activity. For example, researchers studying attention have found specific areas of the brain where
activity increases as the complexity of a task increases and more attention is required (Posner &
Badgaiyan, 1998). Other research has used brain imaging to determine which areas of the brain
are involved in different kinds of memory tasks (Wager & Smith, 2003) or in the processing of
information about pain (Wager et al., 2004).
One advantage of physiological measures is that they are extremely objective. The
equipment provides accurate, reliable, and well-defined measurements that are not dependent on
subjective interpretation by either the researcher or the participant. One disadvantage of such
measures is that they typically require equipment that may be expensive or unavailable. In
addition, the presence of monitoring devices creates an unnatural situation that may cause
participants to react differently than they would under normal circumstances. A more important
concern with physiological measures is whether they provide a valid measure of the construct.
Heart rate, for example, may be related to fear, but heart rate is not the same thing as fear.
Increased heart rate may be caused by anxiety, arousal, embarrassment, or exertion as well as by
fear. Can we be sure that measurements of heart rate are, in fact, measurements of fear?

Behavioral Measures
Behavioral measures provide researchers with a vast number of options, making it
possible to select the behaviors that seem to be best for defining and measuring the construct. For
example, the construct “mental alertness” could be operationally defined by behaviors such as
reaction time, reading comprehension, logical reasoning ability, or ability to focus attention.

Depending on the specific purpose of a research study, one of these measures probably is more
appropriate than the others. In clinical situations in which a researcher works with individual
clients, a single construct such as depression may reveal itself as a separate, unique behavioral
problem for each client. In this case, the clinician can construct a separate, unique behavioral
definition of depression that is appropriate for each patient.
In other situations, the behavior may be the actual variable of interest and not just an
indicator of some hypothetical construct. For a school psychologist trying to reduce disruptive
behavior in the classroom, it is the actual behavior that the psychologist wants to observe and
measure. In this case, the psychologist does not use the overt behavior as an operational
definition of an intangible construct but rather simply studies the behavior itself.
On the negative side, a behavior may be only a temporary or situational indicator of an
underlying construct. A disruptive student may be on good behavior during periods of
observation or shift the timing of negative behaviors from the classroom to the school bus on the
way home. Usually, it is best to measure a cluster of related behaviors rather than rely on a single
indicator. For example, in response to therapy, a disruptive student may stop speaking out of turn
in the classroom but replace this specific behavior with another form of disruption. A complete
definition of disruptive behavior would require several behavioral indicators.

End of Module Assessment


Online Activities/Assignments – These are an integral part of the course. They may come in
various forms, such as group work, individual activities, research work, extended reading, and the
like. They provide opportunities for students to transfer the concepts they have learned in class to
more concrete situations and to participate equally in class discussion.

Learning Reference
Gravetter, F.J. & Forzano, L.B. (2018). Research Methods for the Behavioral Sciences, (6th Ed.).
Boston, MA, USA: Cengage.
