GerardGuthrie 2010 14MeasurementPrincipl BasicResearchMethodsA

This document discusses measurement principles in research. It defines key terms like measurement scales, which categorize data as qualities that can be measured as quantities. Even things that seem immeasurable, like nonexistence, can be conceptualised and placed on measurement scales. The document outlines five measurement scales (binary, nominal, ordinal, interval and ratio) that each add properties to help quantify and analyse variables in social science research.

Uploaded by Obinna Uba. © All Rights Reserved.


14
Measurement Principles

Here is a proposition: everything can be measured.


When I was young, many of us did not like that idea because we thought it dehumanised
us and that some things about people could not be measured anyway.
So, here is a paradox that seems quite illogical: everything can be measured, including
those things that cannot.
There is a trick of course, and that trick is the use of the word ‘measure’. In research,
measure has a particular meaning derived from measurement scales, which are
technically defined methods for classifying or categorising. All information is data
(whether represented by words or numbers) that can be categorised as qualities and,
therefore, measured as quantities.
We can conceptualise and, therefore, classify things that do not exist because ‘things
that do not exist’ is a category that can be represented on a measurement scale. This
comes about because the absence of something is defined in relation to its presence. Only
if we can define something, that is, recognise its qualities or attributes (for example,
‘things that exist’), can we conceptualise its absence (‘things that do not exist’). In both
cases, we are categorising and, thus, measuring them on a binary scale.
This chapter provides principles underlying data collection and analysis. Even if you
do not intend to do quantitative research, you need to understand these principles.
In many ways, these principles make this chapter the most important one in the book
and you should keep revisiting it. The chapter will:

1. define some key measurement terms, using examples of the sort that you might
find in your own research; and
2. look at some other key measurement principles that underlie all social science
research, both quantitative and qualitative, including hypothesis testing,
probability and randomness.

Copyright 2010. Sage Publications Pvt. Ltd.

EBSCO Publishing : eBook Collection (EBSCOhost) - printed on 10/4/2021 9:07 AM via AMERICAN UNIVERSITY OF NIGERIA
AN: 340339 ; Gerard Guthrie.; Basic Research Methods : An Entry to Social Science Research
Account: ns015845.main.ehost

14.1 Measurement scales


How are variables measured? A quantity is a quality expressed as a number, which
might be as basic as 1 or 0 on the binary scale used to classify presence or absence. For
example, we can see the effects of the presence of electricity when we turn a switch
on (‘1’) and its absence when we turn it off (‘0’). In a social science research project, we
might interview someone and ask if they have been to school (presence of schooling)
or not (absence of schooling), which can also be coded as 1 or 0. This is simple, but
the approach can be very powerful in practice—it is how computers store data.
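As a minimal sketch of this binary coding idea (the function and variable names here are invented for illustration, not from the source), the schooling example might be coded in Python like this:

```python
# Code a yes/no interview answer as presence (1) or absence (0) of schooling,
# the binary scale described above.
def code_binary(answer: str) -> int:
    """Return 1 if schooling is present, 0 if absent."""
    return 1 if answer.strip().lower() == "yes" else 0

responses = ["Yes", "no", "YES", "No"]
coded = [code_binary(r) for r in responses]
print(coded)  # [1, 0, 1, 0]
```

The normalisation with `strip()` and `lower()` simply makes the coding robust to how interviewers happened to record the answer.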
Conventionally, four measurement scales are used. These are the nominal, ordinal,
interval and ratio scales, but we need to include the usually overlooked binary scale
too. Table 14.1 shows that each scale adds to the measurement properties of the
lower-order ones. The mathematical properties of the scales imply different types of
descriptive and inferential statistics that are appropriate. The further up the scales, the
more mathematical information is added, the more precise the measurement and
the more powerful the statistical tests.

Table 14.1 Measurement scales

Ratio: Mutually exclusive, equal-interval, ordered categories, plus: (a) true zero.
Physical example: weight. Social examples: age (0/1/2, etc.); funding ($0/$1/$2, etc.).

Interval: Mutually exclusive ordered categories, plus: (a) categories differ by equal
amounts; (b) arbitrary zero point, if any. Physical example: temperature (Celsius or
Fahrenheit scales). Social example: opinion scored 1–10 on a response scale.

Ordinal: Mutually exclusive categories, plus: (a) orders categories logically as
greater than > less than; (b) differences between the categories not necessarily equal.
Physical example: height (tall > medium > short). Social examples: level of formal
education (tertiary > secondary > primary); rank order (1st > 2nd > 3rd in class);
attitude (very good > good > average > bad > very bad).

Nominal: (a) Classifies objects into different categories; (b) categories mutually
exclusive (an object can only belong to one category); (c) no logical order to
categories. Physical example: types of object (animal/mineral/vegetable). Social
examples: gender (male/female); marital status (unmarried/married/divorced/
widowed); nationality (Indian/Malaysian/Other).

Binary: (a) Quantifies object as present or absent. Physical example: electricity
(on/off). Social example: schooling (some schooling/no schooling).

Source: Author.
Most social science research only has data that can be measured on the nominal and
ordinal scales. Existing groups are predominantly labelled using the nominal scale (for
example, males/females, married/unmarried), or data is collected using the ordinal
scale (for example, strongly agree > agree > disagree > strongly disagree).

14.2 Testing hypotheses


In research, measurement scales order data: (a) informally to explore patterns arising
from the data as we analyse it; and (b) formally to test hypothesised relationships
between variables. As we began to see in Chapter 4, the logic of hypothesis testing is
convoluted.
Research cannot prove correct theories or hypotheses derived from them; it can only
prove them wrong (that is, refute or falsify or reject them). A theory leads to a research
hypothesis predicting a positive relationship between variables. For testing, the
research hypothesis must be defined more precisely as an operational hypothesis, that
is, one with carefully defined measurement characteristics. We cannot prove that the
hypothesised relationship exists, so operationally we test for its non-existence using
the null hypothesis, which is a prediction that no difference will be found. In other
words, we do not test the proposition that the research hypothesis is correct (because
it can never be proven); we test the proposition that it is incorrect (because we can
disprove it).

1. Rejection of the null hypothesis gives a difference predicted by the research
hypothesis and the theory from which it derives, which are supported (technically,
the research failed to reject them).
2. If the null hypothesis is supported, the difference predicted by the research
hypothesis did not occur. In this case, the research hypothesis and maybe the
theory are rejected as false.

Box 14.1 shows how formal hypothesis testing and informal searching of data added
to the validity of findings in the teacher education study. The box reviews how the
hypotheses were progressively refined to become operational, for example, by
giving an operational definition of ‘professional acceptability’ as being measured by
eight-item global judgements by inspectors. While the wording of the hypotheses has


Box 14.1 Increasing the specificity of hypotheses

Teacher Education Hypotheses


One of the ‘research hypotheses’ in the teacher education study was:
H1 Increased amounts of professional training will result in graduates being rated as more
professionally acceptable by inspectors.
This was revised to become a more detailed ‘operational hypothesis’: increased number of years of
professional training [in six defined teacher education programmes] will result in graduates being
rated as more professionally acceptable using eight-item global judgements by inspectors of teacher
performance in secondary schools.
The statistical procedures were defined as an item analysis using coefficient alpha to test whether
the data from 870 inspection reports were sufficiently reliable to test scale totals, with one-way
analysis of variance testing totals for significance of differences between the programmes with a
0.05 level of confidence.
The ‘null hypothesis’ was:
H0 There will be no statistically significant difference in professional acceptability between
the teacher education programmes.
Source: Adapted from Guthrie (1983a: 23).

quite small differences and seems repetitious, each one refines the previous one to
meet the next step in the formal logic of scientific measurement.
The research hypotheses were already quite specific, so the operational hypotheses
acted to narrow down the measurement possibilities. For example, professional and
general education were measured crudely in years (or parts of years) because apparently
more precise options such as numbers of courses or contact hours amongst subjects
varied widely in practice and could not be measured accurately. The null hypotheses
gave the formal statistical logic.
The result was that the statistical tests did not find statistically significant differences
in the professional acceptability of the graduates from the two different programmes. This
part of the research evidence failed to reject the null hypotheses, so the operational
and research hypotheses were apparently not supported. However, the programmes
themselves were very different in approach, content, length and costs. The longer, more
expensive programmes were no more effective than the shorter, cheaper ones.
In fact, the operational and null hypotheses were not provided formally in the report,
but were implied in lengthy discussion of their elements in the text. In sociological
research, this is usually acceptable, but in experimental research, they would probably
be written out.


14.3 Probability
Null hypotheses are tested statistically. Because we cannot be absolutely sure of
anything, the test results are expressed as probabilities. In science, we use the probability
level to predict the likely occurrence of predicted events. The social sciences usually set
95 per cent as the acceptable likelihood of an outcome occurring. Behavioural sciences
like psychology also use a 99 per cent level. Biological and, especially, physical sciences,
which are better able to control the variables they study, can go higher.
Statistical analysis does not usually express the outcomes of hypothesis testing as
levels of probability (chances of being right), but as levels of confidence (the chances of
not being wrong). One provides a balance for the other (a 95 per cent probability gives
a 5 per cent level of confidence), but levels of confidence are expressed as decimals
(5 per cent is expressed as .05).
A .05 level of confidence expresses the level of confidence we have when we reject
the null hypothesis that there is no difference. It shows we have a result with only one
chance in 20 of being wrong in finding a difference and, by interpretation, of failing to
reject the research hypothesis predicting the difference. A .01 level is a higher level of
confidence (despite the smaller number) because it provides only one chance in 100
of being wrong. Thus, with a .03 result, for example, we can say we reject the null
hypothesis at the .05 level (because .03 < .05).
When we find a statistically significant difference, the interpretive term ‘accept’
can be used in practice instead of the correct but clumsy ‘fail to reject’. Note in the
report that you are doing this so that the technical reader knows you understand the
difference.
A .05 level of confidence means we are vulnerable to two types of error on 5 per
cent of occasions:

1. Type I errors are false positive results (that is, incorrect rejection of the null
hypotheses).
2. Type II errors are false negative results (that is, incorrect acceptance of the
null hypotheses).

These are not a result of incorrect use of statistical tests or of mistakes in computation,
but are random consequences of the use of probabilities in sampling.
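A short simulation, using only Python's standard library, illustrates the point that Type I errors are random consequences of probability rather than computational mistakes: when the null hypothesis is true and we test at the .05 level, roughly 5 per cent of tests still reject it by chance. (This sketch and its numbers are illustrative, not from the source.)

```python
import random

random.seed(42)  # fixed seed so the illustration is reproducible

ALPHA = 0.05
N_TESTS = 10_000

# Simulate tests where the null hypothesis is in fact true: each test's
# p-value is then uniformly distributed between 0 and 1, so it falls
# below ALPHA purely by chance.
false_positives = sum(1 for _ in range(N_TESTS) if random.random() < ALPHA)

rate = false_positives / N_TESTS
print(f"Type I error rate: {rate:.3f}")  # close to 0.05
```

No matter how carefully the tests are computed, about one in twenty will be a false positive; that is the gamble built into the .05 level.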

14.4 Randomness
Randomness is a mathematical principle affecting all research. This is true, too, of
research that does not involve quantification and statistics. Selection of case studies


and use of qualitative techniques do not make randomness go away; they merely make
it less transparent and more prone to reliability issues.
Random events are ones where we cannot predict each outcome individually
(which is much the same as saying something was an accident). We might know that
there is a pattern of events with certain levels of probability, but we cannot know
what the next event will actually be. For example, we know that 50 per cent of the time,
on an average, a tossed coin will show heads, and the other 50 per cent of the time tails
will come up, but we cannot know whether the next toss will be a head or a tail.
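The coin-toss example can be simulated directly with Python's standard library (a sketch for illustration; the seed and toss count are arbitrary):

```python
import random

random.seed(1)  # reproducible illustration

# The long-run pattern is predictable: about 50 per cent heads.
tosses = [random.choice(["heads", "tails"]) for _ in range(10_000)]
heads_share = tosses.count("heads") / len(tosses)
print(f"Share of heads: {heads_share:.3f}")  # close to 0.5

# But the next individual toss cannot be predicted from the pattern.
next_toss = random.choice(["heads", "tails"])
print(f"Next toss: {next_toss}")
```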
The probability levels used in social science are a reflection that statistical testing
provides findings that are highly calculated gambles. At a .05 level of confidence, we
know that we will be right 19 times out of 20 on an average, but we do not know which
one of the 20 occurrences will have the chance error.
If a large number of statistical tests give some unexplained results, random sampling
error could be the reason. For example, the possibility of Type I error was a point of reference
when sample results from the crime victimisation surveys were synthesised, which
Box 14.2 illustrates.

Box 14.2 Type I error in sampling

Sampling Error
The 16 crime victimisation survey samples were each tested against four variables of age, gender,
marital status and education using population data from the 2000 Papua New Guinea Census. The
level of confidence was set at .05.
Age means and standard deviations for all samples were statistically acceptable, with one
exception out of 16 (i.e., 6%, which was 1% above the permissible level of 5% of Type I false
positive errors). Arawa, in 2006, had a sample age mean of 30.3 years compared to the census
result of 31.8 years for the 15+ population, i.e., an average age 1.5 years below the
census, and a narrower age range than expected (a standard deviation of 10.2 years
rather than 12.4).
Technically the sample should have been rejected, albeit by a very small 1% margin, but it was
not, for four reasons. First, the census was taken close to the end of a civil war, and there was
no guarantee that it was very accurate. Second, it was possible that the population numbers had
changed somewhat as peace developed and the town was resettled. Third, the difference between
the sample and the population means was small. Fourth, the age and gender cohorts and all other
tested parameters matched the census. The age data was used with appropriate qualification.
However, three of the 16 samples had unacceptable differences in married numbers and three
had unacceptably high levels of people with technical/university education. Both marital status
and educational level had 19% of null hypotheses rejected, which was well above the permissible
5% level. The reports used population estimates based on age and gender, and did not present
data based on marital status or educational levels.

Source: Adapted from Guthrie (2008: Appendix C).


A further complication is that the chances of error apply not only to individual tests,
but also to samples as a whole. Not only might the statistical tests be wrong on 5 per
cent of occasions, 5 per cent of samples and the results derived from them might be
wrong too. This is part of the reason why there is controversy over so many research
findings—apparently similar studies can produce different results. The problem can
be a result of random sampling error in some of the studies, which is one reason why
meta-analyses group studies together to check the distribution of results.
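The compounding of chance error across samples can be made concrete with a little arithmetic (a sketch under the usual independence assumption, not a calculation from the source): at the .05 level, the chance that at least one of k independent samples is affected by chance error is 1 − 0.95^k.

```python
# Chance that at least one of k independent samples, each tested at the
# .05 level of confidence, is affected by chance error.
def chance_of_any_error(k: int, alpha: float = 0.05) -> float:
    return 1 - (1 - alpha) ** k

for k in (1, 16, 100):
    print(f"{k} samples: {chance_of_any_error(k):.2f}")
```

With 16 samples, as in Box 14.2, the chance that at least one is affected by sampling error is already above one in two, which is why synthesised survey results need to be checked against the expected error rate.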

14.5 Summary
Measurement scales define how to classify or categorise. All information is data,
represented by words or numbers that can be categorised as qualities and measured
as quantities.

Measurement scales
1. Five types of measurement scale are binary, nominal, ordinal, interval and ratio.
The further up the scales, the more precise the measurement.
2. Most social science research only has nominal and ordinal data.

Testing hypotheses
1. Measurement scales order research data informally to explore patterns in the
data, and formally to test hypothesised relationships between the ordered
variables.
2. Research cannot prove correct theories or hypotheses derived from them; it can
only prove them wrong.
(a) If the null hypothesis is rejected, the difference predicted by the research
hypothesis occurs, so the research hypothesis is not rejected as false.
(b) If the null hypothesis is supported, the difference predicted did not occur
and the research hypothesis is rejected as false.

Probability
1. Statistical results are probabilities. The social sciences usually accept a 95 per
cent level.
2. Type I errors are false positive results. Type II errors are false negative results.
These are random consequences of sampling error.


Randomness
Random error also affects samples as a whole.

Conducting qualitative research might lead to the false conclusion that measurement
is unnecessary. Both words and numbers must be analysed carefully. Just because you
have collected words does not make them meaningful, especially if they are presented
in a disorganised way. Does some data interpretation—whether of words or numbers—
seem confused? Perhaps you are mixing variables or misinterpreting the underlying
measurement scales. Even if a research project completely avoids numbers, it still needs
an understanding of basic measurement principles.

14.6 Annotated references


Babbie, E. (2007). The Practice of Social Research, 11th edition. Belmont: Wadsworth.
This sociology text has chapters on measurement scales and qualitative and
quantitative data analysis.
Cozby, P. (2009). Methods in Behavioral Research, 10th edition. Boston: McGraw Hill.
A psychology text with clear chapters on measurement and statistical principles.
Desai, V. and R. Potter (eds). (2006). Doing Development Research. New Delhi: Vistaar.
The readings in this book on development research include chapters on quantitative
and qualitative research.
Scheyvens, R. and D. Storey (eds). (2003). Development Fieldwork: A Practical Guide.
London: Sage.
A comprehensive collection on fieldwork in developing countries containing chapters
on both quantitative and qualitative research.
