Lesson 6 Measurement
Introduction
Business researchers use many scales or number systems. Not all scales capture the same
richness in a measure, and not all concepts require a rich measure. Traditionally, the level of
scale measurement is seen as important because it determines the mathematical manipulations
that can be done on a data set. The four levels or types of scale measurement are nominal,
ordinal, interval, and ratio.
Nominal scale: this is the most basic and least powerful measurement scale. It involves labeling
or classifying objects by assigning values to them. The values are not ordered in any sense;
they are just a convenient method of identification with no quantitative meaning. The value
can be, but does not have to be, a number because no quantities are being represented. In this
sense, a nominal scale is truly a qualitative scale. A nominal scale only provides a convenient
way of tracking objects or events, since it indicates no order or distance relationship between
the objects. It simply describes differences between objects by assigning them to different
categories. Nominal scales have weak informational value in that they cannot capture any
relationships or other attributes that may be known about the objects being measured. They are
the least precise scales and therefore less suitable in research.
Ordinal scale: this is the lowest level of ordered measurement scale. It allows things to be
arranged in order based on how much of some concept they possess, but without any attempt to
make the intervals of the scale equal in terms of some logical rule. It simply ranks the objects
being measured from highest to lowest, often with no absolute values, and the difference between
adjacent ranks is not necessarily equal. Its informational value is restricted to equality and
difference in ranks (greater than or less than); it does not convey the size of the difference
between two ranks.
Interval scale: this is an ordered measurement scale in which the intervals are adjusted in
terms of some established rule that makes the units in every interval equal. The scale may have
an arbitrary zero but lacks an absolute zero (unique origin), so it cannot measure the complete
absence of a trait or characteristic. Interval scales nevertheless have more informational value
than ordinal scales, since they provide for the equality of the intervals and thus enable more
powerful statistical manipulations.
Ratio scale: ratio scales have an absolute zero of measurement and represent the actual amounts
of variables. They have very high informational value and allow all statistical manipulations and
techniques that can be used on real numbers. Multiplication and division can be used with this
scale but not with any other scale. Ratio scales are the most suitable in research because of
their high level of precision.
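The following Python sketch illustrates which summary operations are meaningful at each level. All data values and variable names are hypothetical and chosen only for illustration. The temperature example shows why ratios cannot be trusted on an interval scale: the same two temperatures give different ratios in Celsius and in Fahrenheit because each scale has its own arbitrary zero.

```python
from statistics import mean, median, mode

# Hypothetical data, one variable per level of measurement.
region_codes = [1, 3, 3, 2, 3]        # nominal: the codes are labels only
satisfaction_ranks = [1, 2, 2, 4, 5]  # ordinal: order matters, spacing does not
temps_celsius = [10.0, 20.0, 30.0]    # interval: equal units, arbitrary zero
sales_units = [50.0, 100.0, 150.0]    # ratio: absolute zero


def to_fahrenheit(celsius):
    """Convert Celsius to Fahrenheit (changes both the unit and the zero point)."""
    return celsius * 9 / 5 + 32


# Nominal: only counting and the mode are meaningful.
most_common_region = mode(region_codes)

# Ordinal: the median respects rank order.
middle_rank = median(satisfaction_ranks)

# Interval: means and differences are meaningful, but ratios are not,
# because the ratio changes when the arbitrary zero changes.
mean_temp = mean(temps_celsius)
ratio_in_celsius = temps_celsius[1] / temps_celsius[0]
ratio_in_fahrenheit = to_fahrenheit(temps_celsius[1]) / to_fahrenheit(temps_celsius[0])

# Ratio: the ratio survives a change of unit (here, single units to dozens).
ratio_in_units = sales_units[1] / sales_units[0]
ratio_in_dozens = (sales_units[1] / 12) / (sales_units[0] / 12)

print("mode of region codes:", most_common_region)
print("median rank:", middle_rank)
print("mean temperature:", mean_temp)
print("temperature ratio (C vs F):", ratio_in_celsius, ratio_in_fahrenheit)
print("sales ratio (units vs dozens):", ratio_in_units, ratio_in_dozens)
```

The two temperature ratios disagree, while the two sales ratios agree. That difference is what the absolute zero of a ratio scale provides.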
Ideally, measurement should be precise and unambiguous. More often than not, however, this is
not the case, and a researcher should therefore be aware of the sources of error in research
measurement, which include:
Respondent: sometimes the respondent may be reluctant to express true feelings, or may lack
knowledge of the aspects being measured but fail to admit this ignorance, which may lead to
wrong information being conveyed. Other factors such as fatigue, boredom, and anxiety may also
limit the ability of the respondent to give accurate and full information.
Situation: situational factors can also inhibit correct measurement, e.g. any condition that
strains the interviewer-respondent rapport may distort the accuracy of the information captured.
Measurer: the interviewer can distort the responses by rewording or even reordering the
questions. The interviewer's behavior, appearance, communication, and etiquette may encourage or
discourage certain responses from respondents.
Instrument: errors may also arise because of defective measurement instruments, e.g. use of
technical language in the instrument, poor printing, inadequate space for responses, or omitted
response choices.
The three major criteria for evaluating measurements are reliability, validity, and sensitivity.
a. Reliability
The split-half method checks internal consistency by dividing a multiple-item scale into two
halves and correlating the scores on the two halves. The problem with the split-half method is
determining the two halves. Coefficient alpha (α) is the most commonly applied estimate of a
multiple-item scale's reliability. Coefficient α represents internal consistency by computing
the average of all possible split-half reliabilities for a multiple-item scale. The coefficient
demonstrates whether or not the different items converge.
Coefficient alpha ranges in value from 0, meaning no consistency, to 1, meaning complete
consistency (all items yield corresponding values). Generally, scales with a coefficient of
between 0.80 and 0.95 are considered to have very good reliability while scales with a
coefficient of between 0.70 and 0.80 are considered to have good reliability, and an α value
between 0.60 and 0.70 indicates fair reliability. When the coefficient α is below 0.6, the scale has
poor reliability.
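As a minimal sketch, coefficient α can be computed in Python as shown below. The code uses the widely used variance-based formula, α = (k / (k - 1)) × (1 - sum of item variances / variance of total scores), which is an assumption here since the text does not give a formula. The function names and the sample responses are hypothetical. The text's ranges share their endpoints, so the code assigns each boundary value (0.60, 0.70, 0.80) to the higher band, which is also an assumption.

```python
from statistics import variance


def coefficient_alpha(item_scores):
    """Coefficient alpha for a multiple-item scale.

    item_scores is a list of items; each item is a list holding one score
    per respondent, in the same respondent order for every item.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("coefficient alpha needs at least two items")
    # Total scale score for each respondent (sum across items).
    totals = [sum(responses) for responses in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


def describe_reliability(alpha):
    """Map alpha to the labels used in the text (boundary handling is assumed)."""
    if alpha < 0.60:
        return "poor"
    if alpha < 0.70:
        return "fair"
    if alpha < 0.80:
        return "good"
    if alpha <= 0.95:
        return "very good"
    return "above the ranges given in the text"


# Hypothetical responses: three items answered by five respondents.
items = [
    [4, 5, 3, 2, 4],
    [4, 4, 3, 2, 5],
    [5, 5, 2, 3, 4],
]
alpha = coefficient_alpha(items)
print(round(alpha, 3), describe_reliability(alpha))
```

When every item gives identical scores for each respondent, this formula returns 1, which matches the text's description of complete consistency.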
The test-retest method of determining reliability involves administering the same scale or
measure to the same respondents at two separate times to test for stability. If the measure is
stable over time, the test, administered under the same conditions each time, should obtain
similar results. Test-retest reliability represents a measure’s repeatability.
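One common way to quantify test-retest stability is to correlate the scores from the two administrations. The sketch below computes a Pearson correlation by hand. The choice of Pearson's r is an assumption (the text does not name a statistic), and the scores and names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores do not vary; correlation is undefined")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scale scores for the same five respondents at two times.
scores_time_1 = [10, 12, 15, 18, 20]
scores_time_2 = [11, 12, 14, 19, 20]

test_retest_r = pearson_r(scores_time_1, scores_time_2)
print("test-retest correlation:", round(test_retest_r, 3))
```

A correlation close to 1 indicates that respondents keep roughly the same relative standing across the two administrations, which is what a stable measure should produce.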
Measures of test-retest reliability pose two problems that are common to all longitudinal studies.
First, the pre-measure, or first measure, may sensitize the respondents to their participation in a
research project and subsequently influence the results of the second measure (as you may recall).
b. Validity
Good measures should be both consistent and accurate. Reliability represents how consistent a
measure is, in that the different attempts at measuring the same thing converge on the same
point. Accuracy deals more with how a measure assesses the intended concept. Validity is the
accuracy of a measure or the extent to which a score truthfully represents a concept. In other
words, are we accurately measuring what we think we are measuring?
Establishing Validity
The four basic approaches to establishing validity are face validity, content validity, criterion
validity, and construct validity. Face validity refers to the subjective agreement among
professionals that a scale logically reflects the concept being measured. Do the test items look
like they make sense given a concept’s definition? When an inspection of the test items
convinces experts that the items match the definition, the scale is said to have face validity.
Content validity refers to the degree to which a measure covers the domain of interest. Do the
items capture the entire scope of the concept we are measuring without going beyond it? Criterion validity
addresses the question, “How well does my measure work in practice?” Because of this, criterion
validity is sometimes referred to as pragmatic validity. In other words, is my measure practical?
Criterion validity may be classified as either concurrent validity or predictive validity depending
on the time sequence in which the new measurement scale and the criterion measure are
correlated. If the new measure is taken at the same time as the criterion measure and is shown to
be valid, then it has concurrent validity. Predictive validity is established when a new measure
predicts a future event. The two measures differ only on the basis of a time dimension—that is,
the criterion measure is separated in time from the predictor measure.
Construct validity exists when a measure reliably measures and truthfully represents a unique
concept. Construct validity consists of several components, including:
• Face validity
• Content validity
• Criterion validity
• Convergent validity
• Discriminant validity
Convergent validity requires that concepts that should be related are indeed related. Related
concepts should display a significant correlation (convergent validity), but should not be so
highly correlated (above 0.75) that they are not independent concepts (discriminant validity).
Multivariate procedures like factor analysis can be useful in establishing construct validity.
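The convergent and discriminant checks described above can be sketched as a simple rule applied to the correlation between two measures of related concepts. The function name and the lower cut-off of 0.30 are hypothetical. The lower cut-off is only a stand-in for "significant", which in practice depends on sample size and a significance test. The upper limit of 0.75 is the figure given in the text.

```python
def check_construct_correlation(r, lower=0.30, upper=0.75):
    """Classify the correlation between two measures of related concepts.

    lower is a hypothetical stand-in for a significant correlation;
    upper is the 0.75 ceiling mentioned in the text.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    strength = abs(r)
    if strength < lower:
        # Related concepts that barely correlate: convergence is not shown.
        return "too weak: convergent validity in doubt"
    if strength > upper:
        # Measures this similar may not be independent concepts.
        return "too strong: discriminant validity in doubt"
    return "related but distinct"


for r in (0.10, 0.55, 0.90):
    print(r, "->", check_construct_correlation(r))
```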
c. Sensitivity
The sensitivity of a scale based on a single question or single item can also be increased by
adding questions or items. In other words, because composite measures allow for a greater range
of possible scores, they are more sensitive than single-item scales. Thus, sensitivity is generally
increased by adding more response points or adding scale items.
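The effect of extra response points or extra items on the range of possible scores can be shown with a short calculation. The sketch assumes a summated scale in which every item is scored 1 up to the number of response points. That scoring scheme and the function name are assumptions made for illustration.

```python
def possible_scores(n_items, n_points):
    """Number of distinct total scores on a summated scale.

    Assumes each item is scored 1..n_points and the items are summed.
    """
    if n_items < 1 or n_points < 1:
        raise ValueError("need at least one item and one response point")
    lowest = n_items * 1
    highest = n_items * n_points
    return highest - lowest + 1


print("one 5-point item:", possible_scores(1, 5))
print("one 7-point item:", possible_scores(1, 7))
print("four 5-point items:", possible_scores(4, 5))
```

Moving from one 5-point item to one 7-point item raises the number of distinct scores from 5 to 7, while combining four 5-point items raises it to 17, so both routes give the scale more room to register small differences.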