Lesson 6 Measurement

The document discusses measurement and scaling in research, emphasizing the importance of accurately describing properties through reliable and valid numerical assignments. It outlines four levels of scale measurement: nominal, ordinal, interval, and ratio, each with varying degrees of informational value and applicability in research. Additionally, it highlights sources of measurement errors and criteria for good measurement, including reliability, validity, and sensitivity.

Uploaded by

Saadie Essie

MEASUREMENT & SCALING IN RESEARCH

Introduction

Measurement is the process of describing some property of a phenomenon of interest, usually by
assigning numbers in a reliable and valid way. The numbers convey information about the
property being measured. When numbers are used, the researcher must have a rule for assigning
a number to an observation in a way that provides an accurate description. The decision about
which concepts to measure is guided by the decision statement, the corresponding research
questions, and the research hypotheses. A concept is a generalized idea that represents something
of meaning. Concepts such as age, sex, education, and number of children are relatively concrete
properties which are easy to define and measure. However, other concepts such as loyalty,
personality, trust, culture, satisfaction, and value are more abstract and thus difficult to define
and measure. Concepts are measured through a process known as operationalization, which involves
identifying scales that correspond to variance in the concept. The scales provide a range of
values that correspond to different values in the concept being measured. Simply put, scales
provide correspondence rules that indicate that a certain value on the scale corresponds to some
true value of a concept.

Levels of Scale Measurement

Business researchers use many scales or number systems. Not all scales capture the same
richness in a measure and yet not all concepts require a rich measure. Traditionally, the level of
scale measurement is seen as important because it determines the mathematical manipulations
that can be done on a data set. The four levels or types of scale measurement are nominal,
ordinal, interval, and ratio level scales.

Nominal scale: this is the most basic and least powerful measurement scale and involves labeling
or classifying objects by assigning numbers or other values to them. The values are not ordered in
any sense but are just a convenient method of identification with no quantitative meaning. The
value can be, but does not have to be, a number because no quantities are being represented. In
this sense, a nominal scale is truly a qualitative scale. A nominal scale only provides a
convenient way of tracking objects or events since it indicates no order or distance relationship
between the objects. It simply describes differences between objects by assigning them to
different categories. Nominal scales have weak informational value in that they are not able to
capture any other relationships or attributes that may be known about the objects being measured.
They are the least precise and therefore less suitable in research.

Ordinal scale: this is the lowest level of ordered measurement scale. It allows things to be
arranged in order based on how much of some concept they possess, but without any attempt to
make the intervals of the scale equal in terms of some logical rule. It simply ranks the objects
being measured from highest to lowest, often with no absolute values, and the difference between
adjacent ranks may not necessarily be equal. Its informational value is restricted to equality and
difference in ranks (greater than or less than), without necessarily indicating the size of the
difference between two ranks.

Interval scale: this is an ordered measurement scale where the intervals are adjusted in
terms of some established rule that makes the units in every interval equal. This scale may have
some arbitrary zero but suffers from the lack of an absolute zero (unique origin) in that it cannot
measure the complete absence of a trait or a characteristic. Interval scales however have more
informational value than ordinal scales since they provide for the equality of the intervals and
thus enable more powerful statistical manipulations.

Ratio scale: ratio scales have an absolute zero of measurement and represent the actual amounts
of variables. They have very high informational value and allow all statistical manipulations and
techniques that can be used on real numbers. Multiplication and division can be used with this
scale but not with any other scale. They are the most suitable in research because of their high
levels of precision.
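The link between scale level and permissible manipulations can be sketched in code. The example below is only an illustration: the `PERMISSIBLE` mapping, the `summarize` function, and the sample data are hypothetical and simply encode the conventional view that each level allows the summaries of the levels below it plus one more. It assumes Python 3.8 or later.

```python
from statistics import mean, median, mode

# Hypothetical mapping (not from the lesson) from scale level to the
# summaries that are conventionally treated as meaningful at that level.
PERMISSIBLE = {
    "nominal": {"mode"},
    "ordinal": {"mode", "median"},
    "interval": {"mode", "median", "mean"},
    "ratio": {"mode", "median", "mean", "ratio"},
}


def summarize(values, level):
    """Return only the summaries that are meaningful at the given scale level."""
    allowed = PERMISSIBLE[level]
    result = {}
    if "mode" in allowed:
        result["mode"] = mode(values)
    if "median" in allowed:
        result["median"] = median(values)
    if "mean" in allowed:
        result["mean"] = mean(values)
    if "ratio" in allowed:
        # Ratios are only meaningful when zero means "none of the property".
        result["max_to_min_ratio"] = max(values) / min(values)
    return result


# Made-up data for each level.
regions = ["north", "south", "north", "east"]   # nominal: labels only
satisfaction_ranks = [1, 2, 2, 3, 5]            # ordinal: order, unequal gaps
temperatures_c = [10, 20, 30]                   # interval: arbitrary zero
incomes = [100, 200, 400]                       # ratio: absolute zero

print(summarize(regions, "nominal"))
print(summarize(satisfaction_ranks, "ordinal"))
print(summarize(temperatures_c, "interval"))
print(summarize(incomes, "ratio"))
```

Note that the interval example gets a mean but no ratio: 20 degrees Celsius is not "twice as hot" as 10 degrees, because the zero point is arbitrary.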

Sources of Errors in Measurement

Ideally, measurement should be precise and unambiguous. However, this is often not the case,
and a researcher should therefore be aware of the sources of error in research measurement,
which include:

Respondent: sometimes the respondent may be reluctant to express true feelings, or may lack
knowledge of the aspects being measured but fail to admit this ignorance, which may lead to
wrong information being conveyed. Other factors such as fatigue, boredom, and anxiety may also
limit the ability of the respondent to give accurate and full information.

Situation: situational factors may also inhibit correct measurement, e.g. any condition straining
the interviewer-respondent rapport may distort the accuracy of the information captured.

Measurer: the interviewer can distort the responses by rewording or even reordering the
questions. The interviewer's behavior, looks, appearance, communication, and etiquette may
encourage or discourage certain responses from respondents.

Instrument: errors may also arise because of defective measurement instruments, e.g. use of
technical language in the instrument, poor printing, inadequate space for responses, or omission
of response choices.

Criteria for Good Measurement

The three major criteria for evaluating measurements are reliability, validity, and sensitivity.

a. Reliability

Reliability is an indicator of a measure’s internal consistency. Consistency is the key to
understanding reliability. A measure is reliable when different attempts at measuring something
converge on the same result. The concept of reliability revolves around consistency.

Assessing Internal Consistency

Internal consistency represents a measure’s homogeneity. The set of items that make up a
measure are referred to as a battery of scale items. Internal consistency of a multiple-item
measure can be measured by correlating scores on subsets of items making up a scale. The
split-half method of checking reliability is performed by taking half the items from a scale (for
example, odd-numbered items) and checking them against the results from the other half
(even-numbered items). The two scale halves should produce similar scores and correlate highly.
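As a minimal sketch of the split-half method, the code below sums the odd-numbered and even-numbered items for each respondent and correlates the two half-scores. The response data are invented for illustration, and using the Pearson correlation as the "check" between halves is an assumption; the lesson does not name a specific statistic.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def split_half_correlation(responses):
    """Correlate odd-item and even-item half-scores.

    responses: one list per respondent, one score per scale item.
    """
    odd_half = [sum(row[0::2]) for row in responses]   # items 1, 3, ...
    even_half = [sum(row[1::2]) for row in responses]  # items 2, 4, ...
    return pearson(odd_half, even_half)


# Hypothetical data: 5 respondents answering a 4-item, 5-point scale.
responses = [
    [5, 4, 5, 4],
    [3, 3, 4, 3],
    [2, 2, 1, 2],
    [4, 5, 4, 4],
    [1, 2, 2, 1],
]

r = split_half_correlation(responses)
print(round(r, 3))
```

A high correlation between the halves suggests the items are measuring the same thing; a different way of splitting the items could give a somewhat different value.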

The problem with the split-half method is deciding how to divide the items into two halves,
since different splits can give different results. Coefficient alpha (α) is the most commonly
applied estimate of a multiple-item scale’s reliability. Coefficient α represents
internal consistency by computing the average of all possible split-half reliabilities for a
multiple-item scale. The coefficient demonstrates whether or not the different items converge.
Coefficient alpha ranges in value from 0, meaning no consistency, to 1, meaning complete
consistency (all items yield corresponding values). Generally, scales with a coefficient of
between 0.80 and 0.95 are considered to have very good reliability while scales with a
coefficient of between 0.70 and 0.80 are considered to have good reliability, and an α value
between 0.60 and 0.70 indicates fair reliability. When the coefficient α is below 0.6, the scale has
poor reliability.
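The sketch below computes coefficient alpha using the common variance-based formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores), and then applies the bands given above. The formula is the standard computational form rather than one stated in this lesson, the data are invented, and the handling of boundary values (0.70 and 0.80 go to the higher band, values above 0.95 are left unclassified) is an assumption, since the text does not say.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Coefficient alpha for a list of respondents' item scores."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # one tuple of scores per item
    item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variances / total_variance)


def interpret_alpha(alpha):
    """Map alpha to the reliability bands listed in the text.

    Assumption: shared boundaries (0.70, 0.80) fall in the higher band,
    and values above 0.95 are not classified because the text is silent.
    """
    if alpha > 0.95:
        return "not classified in the text"
    if alpha >= 0.80:
        return "very good"
    if alpha >= 0.70:
        return "good"
    if alpha >= 0.60:
        return "fair"
    return "poor"


# Hypothetical data: 5 respondents answering a 4-item, 5-point scale.
responses = [
    [4, 5, 3, 4],
    [3, 2, 4, 3],
    [2, 3, 2, 1],
    [5, 4, 4, 5],
    [1, 2, 3, 2],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 3), interpret_alpha(alpha))
```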

Test - Retest Reliability

The test-retest method of determining reliability involves administering the same scale or
measure to the same respondents at two separate times to test for stability. If the measure is
stable over time, the test, administered under the same conditions each time, should obtain
similar results. Test-retest reliability represents a measure’s repeatability.

Measures of test-retest reliability pose two problems that are common to all longitudinal studies.
First, the pre-measure, or first measure, may sensitize the respondents to their participation in a
research project and subsequently influence the results of the second measure (for example,
respondents may recall their earlier answers).

b. Validity

Good measures should be both consistent and accurate. Reliability represents how consistent a
measure is, in that the different attempts at measuring the same thing converge on the same
point. Accuracy deals more with how a measure assesses the intended concept. Validity is the
accuracy of a measure or the extent to which a score truthfully represents a concept. In other
words, are we accurately measuring what we think we are measuring?

Establishing Validity

The four basic approaches to establishing validity are face validity, content validity, criterion
validity, and construct validity.

Face validity refers to the subjective agreement among professionals that a scale logically
reflects the concept being measured. Do the test items look like they make sense given a
concept’s definition? When an inspection of the test items convinces experts that the items
match the definition, the scale is said to have face validity.

Content validity refers to the degree that a measure covers the domain of interest. Do the items
capture the entire scope of, but not go beyond, the concept we are measuring?

Criterion validity addresses the question, “How well does my measure work in practice?” Because
of this, criterion validity is sometimes referred to as pragmatic validity. In other words, is my
measure practical? Criterion validity may be classified as either concurrent validity or
predictive validity depending on the time sequence in which the new measurement scale and the
criterion measure are correlated. If the new measure is taken at the same time as the criterion
measure and is shown to be valid, then it has concurrent validity. Predictive validity is
established when a new measure predicts a future event. The two measures differ only on the
basis of a time dimension; that is, the criterion measure is separated in time from the
predictor measure.

Construct validity exists when a measure reliably measures and truthfully represents a unique
concept. Construct validity consists of several components, including:

• Face validity
• Content validity
• Criterion validity
• Convergent validity
• Discriminant validity

Convergent validity requires that concepts that should be related are indeed related. Related
concepts should display a significant correlation (convergent validity), but should not be so
highly correlated (above 0.75) that they are not independent concepts (discriminant validity).
Multivariate procedures like factor analysis can be useful in establishing construct validity.
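The convergent and discriminant conditions above can be expressed as a simple check on the correlation between two measures. This is a rough sketch under stated assumptions: significance is judged with the usual t statistic for a correlation, t = r√(n − 2) / √(1 − r²), and the caller supplies the critical value from a t table (2.306 is the two-tailed 5% value for 8 degrees of freedom). The function name and the example correlations are hypothetical; only the 0.75 ceiling comes from the text.

```python
from math import sqrt, inf


def check_construct_validity(r, n, t_critical, ceiling=0.75):
    """Check a correlation against the convergent and discriminant rules.

    r: correlation between two measures of related concepts
    n: number of respondents
    t_critical: critical t value for n - 2 degrees of freedom (from a table)
    ceiling: correlation above which the concepts are not treated as distinct
    """
    if abs(r) >= 1:
        t_stat = inf
    else:
        t_stat = abs(r) * sqrt(n - 2) / sqrt(1 - r * r)
    return {
        "convergent": t_stat > t_critical,   # significantly correlated
        "discriminant": abs(r) <= ceiling,   # not so high as to be one concept
    }


# Hypothetical correlations from a sample of 10 respondents (df = 8).
T_CRITICAL_DF8 = 2.306
for r in (0.60, 0.70, 0.90):
    print(r, check_construct_validity(r, n=10, t_critical=T_CRITICAL_DF8))
```

With these made-up numbers, only the middle correlation satisfies both conditions: 0.60 is not significant in so small a sample, and 0.90 is too high for the two concepts to count as independent.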

c. Sensitivity

Sensitivity refers to an instrument’s ability to accurately measure variability in a concept. A
dichotomous response category, such as “agree or disagree,” does not allow the recording of
subtle attitude changes. A more sensitive measure with numerous categories on the scale may be
needed. For example, adding “strongly agree,” “mildly agree,” “neither agree nor disagree,”
“mildly disagree,” and “strongly disagree” will increase the scale’s sensitivity.

The sensitivity of a scale based on a single question or single item can also be increased by
adding questions or items. In other words, because composite measures allow for a greater range
of possible scores, they are more sensitive than single-item scales. Thus, sensitivity is generally
increased by adding more response points or adding scale items.
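The effect of both changes on the range of possible scores can be shown with simple arithmetic. The sketch below assumes each item is scored 1 to p and that the composite is the plain sum of the items; the function name is hypothetical.

```python
def distinct_scores(n_items, n_points):
    """Number of distinct summed scores when each item is scored 1..n_points."""
    lowest = n_items * 1
    highest = n_items * n_points
    return highest - lowest + 1


print(distinct_scores(1, 2))  # one agree/disagree item
print(distinct_scores(1, 5))  # one five-point item
print(distinct_scores(4, 5))  # composite of four five-point items
```

A single dichotomous item can only separate respondents into 2 groups, a single five-point item into 5, and a four-item composite of five-point items into 17 possible score values.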
