
Some Notes on the Terms 'Validity' and 'Reliability'


Author(s): Martyn Hammersley
Source: British Educational Research Journal, Vol. 13, No. 1 (1987), pp. 73-81
Published by: Wiley on behalf of BERA
Stable URL: http://www.jstor.org/stable/1501231
Accessed: 14-02-2016 22:35 UTC


This content downloaded from 137.108.145.45 on Sun, 14 Feb 2016 22:35:59 UTC
All use subject to JSTOR Terms and Conditions
British Educational Research Journal, Vol. 13, No. 1, 1987

Some Notes on the Terms 'Validity' and 'Reliability' [1]

MARTYN HAMMERSLEY, School of Education, The Open University

The problem of measurement is often addressed by means of the concepts of validity and reliability. Some social scientists are concerned to show that their measurements are reliable and, less commonly, that they are valid. This is more frequent in 'quantitative' than in 'qualitative' research, but the basic issues apply to both, as is increasingly being recognised by qualitative researchers (Dobbert, 1982; Evans, 1983; Goetz & Le Compte, 1984; Kirk & Miller, 1985).

There is a large literature dealing with the concepts of reliability and validity. However, much of it concerns the techniques by which these properties can be measured. Rather less attention has been given to the conceptual issues involved. And, in fact, when one looks at discussions of reliability and validity one finds not a clear set of definitions but a confusing diversity of ideas. There are substantial divergencies among different authors' definitions, and there is even some overlap between definitions of the two concepts. The result is that it is often unclear what is being asserted when reliability and validity claims are made, and therefore it is difficult to assess their truth. In this paper I shall be concerned almost exclusively with the meanings given to those terms and with how these relate to the goals of the measurement process.

Variations in Definition

Here is a sample of the definitions of the terms 'reliability' and 'validity' to be found in the methodological literature [2]:
(1) Reliability is the agreement between two efforts to measure the same trait through maximally similar methods. Validity is represented in the agreement between two attempts to measure the same trait through maximally different methods. (Campbell & Fiske, 1967: 277)
(2) The validity of a measuring instrument is defined as the property of a measure that allows the researcher to say that the instrument measures what he says it measures . . . The reliability of a measuring instrument is defined as the ability of the instrument to measure consistently the phenomenon it is designed to measure. (Black & Champion, 1976: 222 and 234)
(3) Reliability refers to the reproducibility of the measurements. Can we rely on our own ability to obtain very similar data again; that is, how good is our intra-observer reliability? Other observers should also be able to replicate our measurements which means, in part, that we should have good inter-observer reliability. This is, of course, often difficult, since skill in observation develops through practice . . . Hollenbeck (1978) concluded that reliability consists of both stability and accuracy. However, this is true only of interobserver reliability . . . an observer may be reliable but still have poor accuracy as long as precision (stability) is maintained. Therefore, intra-observer reliability is solely a measure of stability (or precision) whereas accuracy affects validity . . . However, accuracy will almost certainly affect inter-observer reliability since few observers are likely to have the same biases. An accuracy criterion can be established by using an 'expert' observer or the consensus of several observers. (Lehner, 1979: 130)
(4) Reliability (is) the extent to which repetition of the study would result in the same data and conclusions. (Goode & Hatt, 1952: 153)
(5) The goal of any scientific measurement operation or procedure is to arrive at the best possible estimate of the true value of some dimensional quality of a natural phenomenon. To the extent that this goal is achieved it is said that the measurement is accurate or valid. Accuracy or validity of the results therefore becomes the yardstick for gauging the quality of any measurement procedure. For purposes of clarity, accuracy (or validity) may be defined as the extent to which obtained measures approximate values of the 'true' state of nature . . . Reliability refers to the capacity of the instrument to yield the same measurement value when brought into repeated contact with the same state of nature. Thus, this meaning of reliability is concerned with the stability of measured values under constant conditions. (Johnston & Pennypacker, 1980: 190 and 191)
(6) Reliability is the accuracy or precision of a measuring instrument . . . The commonest definition of validity is epitomized by the question: Are we measuring what we think we are measuring? The emphasis in this question is on what is being measured. For example, a teacher has constructed a test to measure understanding of scientific procedures and has included on the test only factual items about scientific procedures. The test is not valid because, while it may reliably measure the pupils' factual knowledge of scientific procedures, it does not measure their understanding of such procedures. In other words, it may measure what it measures quite well, but it does not measure what the teacher intended it to measure. (Kerlinger, 1964: 430 and 444-5)
(7) A measure is reliable to the extent that the average difference between two measurements independently obtained in the same classroom is smaller than the average difference between two measurements obtained in different classrooms . . . A measure is valid to the extent that differences in scores yielded by it reflect actual differences in behaviour not differences in impressions made on different observers. (Medley & Mitzel, 1963: 150)
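
Definition (7) is the only one in this sample framed as an explicit comparison of averages, so it can be sketched numerically. The Python fragment below is a minimal illustration under stated assumptions: the classroom names and scores are invented, and reading 'average difference' as the mean absolute difference between pairs of scores is one possible interpretation, not something Medley & Mitzel specify in the passage quoted.

```python
from itertools import combinations, product
from statistics import mean

# Hypothetical data: two independently obtained measurements per classroom.
scores = {
    "classroom_a": [12.0, 13.0],
    "classroom_b": [20.0, 18.0],
    "classroom_c": [7.0, 8.0],
}

def mean_within_difference(scores):
    """Average absolute difference between measurements of the same classroom."""
    diffs = [abs(x - y)
             for values in scores.values()
             for x, y in combinations(values, 2)]
    return mean(diffs)

def mean_between_difference(scores):
    """Average absolute difference between measurements of different classrooms."""
    diffs = [abs(x - y)
             for a, b in combinations(scores, 2)
             for x, y in product(scores[a], scores[b])]
    return mean(diffs)

within = mean_within_difference(scores)
between = mean_between_difference(scores)

# On this reading, the measure counts as reliable to the extent that
# the within-classroom average is smaller than the between-classroom one.
print(within, between, within < between)
```

Note that nothing in this calculation refers to the property itself; it compares scores only with other scores.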
One source of problems is the lack of a standardized terminology, so that several terms are used to refer to each of the different aspects of the measurement process. Indeed, sometimes different authors use the same term to refer to different things; and even the same author may use a term to denote different things on different occasions. An example is the term 'measure' which can refer to a measuring instrument or to a particular measurement score. For the purposes of my argument here I shall use the following terms and definitions:
measurement: the process by which an observer applies an instrument to objects in order to gauge the presence/magnitude of a property [3];
property: the feature of objects which is to be measured;
instrument: a procedure developed to measure the presence/magnitude of a property in the objects;
objects: the phenomena (people, lessons, tasks, etc.) whose possession of the property is to be assessed;
scores: the results of the measurement process;
occasion: the time and place where the instrument is applied to produce the scores;
observer: the person who carries out the measurement.
Using this terminology, let us look now at the major discrepancies among the definitions of validity and reliability cited above.
(a) Are reliability and validity concerned with all aspects of a study or do they relate only to the process of measurement? While most definitions take the latter position, some imply the former. For example, Goode & Hatt (1952: 153) define reliability as "the extent to which repetition of the study would result in the same data and conclusions". In other words they identify it with replication, and this clearly involves more than measurement.
In the case of the term 'validity', there is the problem of the relationship between two typologies: criterion, predictive, concurrent, content, face, and construct validity on the one hand; internal, external, population and ecological validity on the other. The former refers to measurement, the latter to the whole process of assessing the truth of explanatory claims. In addition, the term validity is sometimes used to refer to the assessment of arguments in terms of whether they conform to legitimate deductive canons.
(b) Are validity and reliability properties of instruments, observers, or of particular scores? Goode & Hatt treat reliability as a feature of data and conclusions. For the most part, though, reliability seems to be viewed as a property of instruments and/or observers. Validity is sometimes ascribed to instruments (Black & Champion, Kerlinger, Medley & Mitzel), sometimes to observers (Lehner), sometimes to scores (Johnston & Pennypacker).
(c) Are validity and reliability to be defined in terms of the relationship between scores and variation in the property being measured? (Call these realist definitions.) Or are they to be defined in terms of the relationships among scores produced by the same and/or different instruments? (Call these nominalist definitions) [4]. Most definitions of validity are realist, claiming, for example, that validity represents the extent to which an instrument measures the property it is intended to measure. However, there are exceptions. For instance, "validity is represented in the agreement between two attempts to measure the same trait through maximally different methods" (Campbell & Fiske, 1967: 277).
By contrast, most definitions of reliability are nominalist, referring to the scores produced by repeated efforts to measure the same property by means of the same instrument. Once again, though, there are apparent exceptions. For example, Kerlinger (1964: 430) claims that reliability is "the accuracy or precision of a measuring instrument", and Black & Champion (1976: 234) define it as "the ability of the instrument to measure consistently the phenomenon it is designed to measure" (my emphasis).
(d) If we accept a realist definition of validity, are we concerned with the relationship between the scores and the property or that between the scores and the objects? In short is it the property or the objects which are being measured? Most definitions refer to measurement of the property, but some take the alternative view. For instance, Johnston & Pennypacker (1980: 190) define 'accuracy (or validity)' as "the extent to which obtained measures approximate values of the 'true' state of nature." Medley & Mitzel's (1963: 150) definition is another example: "A measure is valid to the extent that differences in scores yielded by it reflect actual differences in behaviour . . ." This is probably just a terminological problem, but it is a potential source of confusion.
(e) If we accept a nominalist definition of reliability, are we concerned with the relationship between scores produced by the same instrument applied to the same object on the same occasion by different observers, or those produced by similar instruments applied to different objects on different occasions by the same observer, or to some other set of permutations of observer, instrument, object and occasion?
At one extreme is Lehner's (1979: 130) definition:

  Reliability refers to the reproducibility of the measurements. Can we rely on our own ability to obtain very similar data again . . .? Other observers should also be able to replicate our measurements.

Here it seems that reliability relates to scores produced by any observer on any occasion using any instrument to measure some set of objects. Medley & Mitzel (1963: 150) explicitly include variations in the magnitude of a property in an object between occasions as a source of unreliability. By contrast, Johnston & Pennypacker (1980: 191) adopt a rather more restricted definition: "reliability refers to the capacity of the instrument to yield the same measurement value when brought into repeated contact with the same state of nature". Most definitions are unclear about what permutations of instrument, observer, objects and occasion produce reliability scores. And, indeed, Campbell & Fiske (following Thurstone, 1939) have argued that variation in these components produces a continuum from reliability to validity. Even if this is so, however, it is important to get clear what is being measured, how and why.
There are some important variations between authors in the way they use the terms 'reliability' and 'validity', then. Moreover, sometimes the same author will move between different definitions of these terms without warning. For example, despite the nominalistic definition of validity quoted above, later in the same article Campbell & Fiske (1967) implicitly adopt a realist definition, discussing the different interpretations that can be made of a failure to find any correlation between the scores produced by two methods intended to measure the same property. If they were to be consistent with their nominalist definition, no such problem of interpretation would arise, the conclusion would be that the validity of the scores is low or zero. In a similar way, Kerlinger (1964) moves through various definitions of reliability without addressing the issue of the relationships among them.
Another common practice is to conflate definitions of reliability in terms of consistency of scores with definitions in terms of random error:

  Reliability concerns the extent to which measurements are consistent and repeatable. Thus, a highly reliable measure is one that does not fluctuate greatly because of random error. (Zeller & Carmines, 1980: 17)

We have two definitions of reliability here which do not match one another. While random error will produce inconsistency in scores, so will certain kinds of systematic error. For example, where scores produced by two observers are affected by biases which operate in opposite directions inconsistencies between the scores of the observers for the same objects will result.
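
The point about opposed biases can be checked with a small numerical sketch. Everything in it is assumed for illustration: the 'true' magnitudes, the size of each bias, and the use of the mean absolute difference as a summary of inconsistency. No random component is included, so any inconsistency that appears is due to systematic error alone.

```python
from statistics import mean

# Hypothetical true magnitudes of the property for five objects.
true_values = [10.0, 12.0, 15.0, 9.0, 11.0]

# Assumed purely systematic biases, with no random component at all.
BIAS_OBSERVER_A = 2.0    # observer A consistently over-scores
BIAS_OBSERVER_B = -2.0   # observer B consistently under-scores

scores_a = [v + BIAS_OBSERVER_A for v in true_values]
scores_b = [v + BIAS_OBSERVER_B for v in true_values]

# Inconsistency between the two observers for the same objects.
disagreement = mean(abs(a - b) for a, b in zip(scores_a, scores_b))

# Observer A scoring the same objects again reproduces the same scores,
# because nothing random enters the calculation.
repeat_a = [v + BIAS_OBSERVER_A for v in true_values]
self_disagreement = mean(abs(x - y) for x, y in zip(scores_a, repeat_a))

print(disagreement)       # 4.0
print(self_disagreement)  # 0.0
```

Here each observer is perfectly consistent taken alone, yet the two disagree on every object, so 'consistent scores' and 'free of random error' cannot be treated as the same thing.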

An Attempt at Clarification
I have tried to show that there is some inconsistency in the usage of the terms 'reliability' and 'validity'. At the risk of adding to the confusion, I want to try to clarify the concepts underlying these terms.
It is important to begin by making a clear distinction between goals and means, between what it is about the measurement process we are trying to assess and the strategies we use to assess it. Only when we are clear about what it is we want to assess can we devise effective strategies for achieving that.
Our primary concern in measurement must surely be whether the set of scores we have produced accurately reflects the presence/magnitude of the target property in the objects we have measured. This is what most writers seem to mean by validity [5]. There are a number of types of threat to measurement validity, but we can distinguish two main sources. If we think of measurement as involving, at its simplest, a relationship between a variable which is not directly observable and one that is, there may be inaccuracies in the recording of scores of the observable variable (we might refer to this as the problem of 'accuracy') and there may be errors arising from imperfect correlation between the observed and the unobserved variables (this is often referred to as the problem of 'construct validity').
However, validity is not our only goal. We are also often interested in the precision with which any particular score captures the magnitude of the target property in an object. Precision concerns the delicacy of the measurement scale employed. We can measure the length of a large object in terms of metres, centimetres or even millimetres. In that order these scales represent an increasing degree of precision. Note that this is independent of the accuracy of the measurement. On this usage a score may be very precise but highly inaccurate. How precise we want our measurement to be will depend upon our purposes, but it will also depend upon the level of validity which can be obtained at different levels of precision. Other things being equal, the more precise the scale, the more difficult it is to achieve high levels of validity. And, indeed, there is often a temptation to be more precise than the level of validity with which an object can be measured justifies.
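
The independence of precision and accuracy can be illustrated with the length example. The sketch below assumes an invented true length and an invented reading from a faulty instrument, and records that reading on the three scales mentioned; the function name `record` and both constants are hypothetical.

```python
# Hypothetical values, chosen only for illustration.
TRUE_LENGTH_M = 12.000    # the (normally unknown) true length, in metres
RAW_READING_M = 12.4873   # what an assumed faulty instrument returns

def record(reading_m, unit_m):
    """Record a reading to the nearest multiple of the scale unit (in metres)."""
    return round(reading_m / unit_m) * unit_m

for name, unit in [("metres", 1.0), ("centimetres", 0.01), ("millimetres", 0.001)]:
    score = record(RAW_READING_M, unit)
    error = abs(score - TRUE_LENGTH_M)
    print(f"{name:12s} score={score:.3f} m  error={error:.3f} m")
```

On these assumed figures the millimetre score is the most precise of the three but is off by nearly half a metre, while the metre score happens to be correct because the coarser scale absorbs the fault. The finer the scale, the more of the instrument's error is carried into the score.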


Is reliability a third goal? It may be, but only if defined in realist terms. Achieving consistency of scores across occasions is of no value in itself, it only has value as an indicator of validity. If, on the other hand, we treat reliability as a property of instruments not of scores, and define it as the ability of an instrument consistently to produce valid scores, then assessing the reliability of instruments and developing reliable instruments are clearly important goals. On this definition we can have a score of high validity without the instrument which produced it being reliable, but we cannot have a reliable instrument producing invalid scores [6]. However, it may be difficult to know that we have a score of high validity without also finding out whether we have a reliable instrument since the same strategies are involved in assessing both validity and reliability, on the definitions used here.
Validity and appropriate precision of scores, and reliability of instruments, are our goals, then. But of course the central problem in measurement is that generally we have no direct access to the property we are trying to measure, and thus we have no straightforward means of assessing the validity of any particular score. If we did have direct access we would presumably have no need of any measuring instrument. In assessing the validity of scores and the reliability of instruments we have to rely upon comparisons of the scores produced under different circumstances, circumstances systematically varied in order to assess the effects of different types of threat to measurement validity. To the extent that scores are consistent across these different circumstances, we can have increased confidence that they are valid and that the instrument is reliable.

Types and Sources of Error
Mueller, Schuessler & Costner's (1977: 24-6) distinction between random, constant and correlated error is more useful, I believe, than the more common twofold distinction between random and systematic error, since the two kinds of systematic error have different characteristics [7]. Here are the authors' definitions:

Random errors: Random errors behave as if the amount and direction of error were determined by drawing signed numbers from a hat, with one half of the numbers in the hat being positive and one half negative and the average of the numbers being zero. (p. 24)

Constant errors: It is as if the error were determined by drawing numbers from a hat, but the average of the numbers in the hat is not zero; consequently each score is inflated (or deflated) by the same amount on the average. (p. 25)

Correlated errors: Correlated error behaves as if the error were determined by drawing numbers from hats, but a different hat (containing different numbers) was used for males and females, or for rich and poor, or for other differentiated groupings. (p. 25)
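
The hat metaphor translates directly into a small simulation. The sketch below is illustrative only: the contents of each hat, the two groupings and the number of draws are assumptions made here, not figures from Mueller, Schuessler & Costner.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical "hats" of numbers.
RANDOM_HAT = [-2, -1, 1, 2]            # half positive, half negative, average 0
CONSTANT_HAT = [1, 2, 4, 5]            # average 3, so scores are inflated on average
GROUP_HATS = {"male": [-1, 0, 0, 1],   # a different hat for each grouping
              "female": [2, 3, 3, 4]}

def draw(hat, n):
    """Draw n errors from a hat, with replacement."""
    return [rng.choice(hat) for _ in range(n)]

n = 10_000
random_errors = draw(RANDOM_HAT, n)
constant_errors = draw(CONSTANT_HAT, n)
correlated_errors = {group: draw(hat, n) for group, hat in GROUP_HATS.items()}

print(round(mean(random_errors), 2))      # close to 0
print(round(mean(constant_errors), 2))    # close to 3
print({g: round(mean(e), 2) for g, e in correlated_errors.items()})
```

With many draws, random error averages out, constant error shifts every group by about the same amount, and correlated error shifts the groups by different amounts. That last feature is why correlated error can distort comparisons between groups in a way that constant error does not.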

Different sources of error are likely to lead to different types of error, though as yet we know too little to be able to tie particular sources to particular types with any certainty. We can only make suggestions as to likely links. For example:


Sources of error                                       Probable types of error
Observer    observation and coding inaccuracies        random, constant or correlated
            calculation mistakes                       random or constant
            interpretational bias                      constant or correlated
Instrument  contamination of scores by factors         random, constant or correlated
            other than the property being measured
Comparisons of scores produced under different circumstances may allow us to assess the effects of different sources and types of error:

Observer: By comparing scores for the same objects produced by the same observer using the same instrument on different occasions we can make some estimate of the level of error deriving from intra-observer variation, so long as we can assume that the property is stable in the object across occasions (or at least between the occasions on which measurement occurred) and that the observer's second score is independent of the earlier one.

By comparing scores for the same objects produced by different observers using the same instrument on the same occasion we can assess the level of inter-observer variation, so long as we can assume that all observers used the instrument in an appropriate way, and that their scores were produced independently.

Instrument: By comparing the scores for the same objects produced by two or more instruments intended to measure the same property used by the same observer on different occasions, we can estimate the error arising from the instrument, so long as we can assume that the property is stable across occasions and that intra-observer error is low.

By comparing the scores for the same objects produced by two or more instruments intended to measure the same property used by different observers on the same occasion, we can estimate the error arising from the instrument so long as we can assume that inter-observer error is low [8].
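
As a rough illustration of the two observer comparisons, the following sketch pairs up scores for the same objects and summarises how far they diverge. It is a crude device under stated assumptions, not one of the established statistical techniques: the scores are invented, the variable names are hypothetical, and the mean absolute difference is used only because it is easy to read.

```python
from statistics import mean

# Hypothetical scores for the same four objects, same instrument throughout.
observer_a_occasion_1 = [14, 9, 21, 17]
observer_a_occasion_2 = [15, 9, 20, 17]   # same observer, later occasion
observer_b_occasion_1 = [12, 10, 18, 15]  # different observer, same occasion

def mean_abs_difference(first, second):
    """Average absolute difference between paired scores for the same objects."""
    if len(first) != len(second):
        raise ValueError("score lists must cover the same objects")
    return mean(abs(x - y) for x, y in zip(first, second))

# Same observer across occasions: a rough indication of intra-observer
# variation, assuming the property was stable and the second scoring
# was independent of the first.
intra = mean_abs_difference(observer_a_occasion_1, observer_a_occasion_2)

# Different observers on the same occasion: a rough indication of
# inter-observer variation, assuming independent and appropriate use
# of the instrument by both.
inter = mean_abs_difference(observer_a_occasion_1, observer_b_occasion_1)

print(intra, inter)
```

The calculation is the same in both cases; what differs is which of observer and occasion is held constant, and therefore which assumptions have to hold for the result to be interpretable.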
I shall not discuss the statistical techniques available to estimate the different types of error on the basis of scores from comparisons of these and other kinds. A number of different approaches are available, though none is unproblematic (Tryon, 1957; Lord & Novick, 1968; Heise & Bohrnstedt, 1970; Cronbach et al., 1972; Zeller & Carmines, 1980). And there are other kinds of comparison which can provide evidence about the validity of a set of scores or the reliability of an instrument. For example, we might investigate the extent to which the scores correlate with scores produced by another instrument designed to measure a variable which is known to correlate strongly with the variable we are trying to measure (Cronbach & Meehl, 1955). Again, if we have good reason to believe that a particular set of objects has a substantially higher level of the property we are seeking to measure than another set of objects, then we can test our instrument by measuring these two sets of objects to discover whether or not the expected difference is to be found.
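
The last of these checks, comparing two sets of objects expected to differ, might be sketched as follows. The two groups and their scores are invented, and the sketch looks only at the direction of the difference in means; how large a difference should count as support is a question it leaves open.

```python
from statistics import mean

# Hypothetical scores from the instrument under test.
# Set of objects believed, on independent grounds, to be high on the property:
expected_high = [31, 28, 35, 30, 33]
# Set of objects believed to be substantially lower:
expected_low = [22, 25, 19, 24, 21]

difference = mean(expected_high) - mean(expected_low)

# The check here is only whether the expected direction of difference appears.
expected_difference_found = difference > 0
print(difference, expected_difference_found)
```

A difference in the expected direction increases confidence in the instrument without establishing its validity; its absence could reflect a fault in the instrument or a mistaken belief about the two sets of objects.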

Conclusion
My concern in this paper has been with the conceptual issues involved in defining validity and reliability. I have proposed definitions of validity, precision and reliability as goals of the measurement process. These are to be distinguished from the strategies which we use to achieve them, which involve the comparison of scores produced under different circumstances. These comparisons allow us to assess the effects of different types and sources of error, and they provide us with a basis for assessing both validity and reliability. Considerable work is still required in developing and applying these strategies. However, a prerequisite for effective work in this area, it seems to me, is to be clear about what it is we are aiming to achieve. I have tried to show that at present our usage of concepts like validity and reliability is vague and inconsistent, and this paper has been directed towards a clarification of these measurement goals, and their relation to strategies designed to assess them.

Correspondence: M. Hammersley, School of Education, The Open University, Walton Hall, Milton Keynes, Bucks MK7 6AA, England.

NOTES
[1] I am obliged to John Scarth, Donald MacKinnon, Barry Cooper and John Bynner for comments on earlier drafts of this article. The errors are of course mine.
[2] This is a haphazard sample, but it does illustrate the range of variation in usage.
[3] I put on one side the question of whether it is legitimate to talk of classification as measurement.
[4] The terms 'realist' and 'nominalist' are used in a variety of ways by philosophers. I use the terms here simply as shorthand.
[5] We probably need to use some adjective like 'measurement' or 'descriptive' validity here to distinguish what we are referring to from logical validity and from internal validity.
[6] Incorrect use of a reliable instrument would produce invalid scores, but it is better to treat this as use of a different instrument.
[7] It is also important to recognise, as Cronbach et al., 1972 emphasise, that what is systematic error given one focus may be variation in the target property from another point of view. Identification of systematic error is relative to the property being measured.
[8] These various comparisons are of course produced simply by combining conventional reliability and validity checks. There are additional possibilities in research employing tests or inventories, such as the use of the split half technique or Cronbach's coefficient alpha.

REFERENCES
BLACK, J. A. & CHAMPION, D. J. (1976) Methods and Issues in Social Research (New York, Wiley).
CAMPBELL, D. T. & FISKE, D. W. (1967) Convergent and discriminant validation by the multitrait-multimethod matrix, in: W. A. MEHRENS & R. L. EBEL (Eds) Principles of Educational and Psychological Measurement (Chicago, Rand McNally).
CRONBACH, L. J. & MEEHL, P. E. (1955) Construct validity in psychological tests, Psychological Bulletin, 52, pp. 281-302.
CRONBACH, L. J. et al. (1972) The Dependability of Behavioural Measurements (New York, Wiley).


DOBBERT, M. (1982) Ethnographic Research: theory and application for modern schools and society (New York, Praeger).
EVANS, J. (1983) Criteria of validity in social research: exploring the relationship between ethnographic and quantitative approaches, in: M. HAMMERSLEY (Ed.) The Ethnography of Schooling (Driffield, Nafferton).
GOETZ, J. & LECOMPTE, M. (1984) Ethnography and Qualitative Design in Educational Research (New York, Academic Press).
GOODE, W. & HATT, P. K. (1952) Methods in Social Research (New York, McGraw Hill).
HEISE, D. & BOHRNSTEDT, G. (1970) Validity, invalidity and reliability, in: E. BORGATTA & G. BOHRNSTEDT (Eds) Sociological Methodology (San Francisco, Jossey-Bass).
HOLLENBECK, A. R. (1978) Problems of reliability in observational research, in: G. P. SACKETT (Ed.) Observing Behaviour, Vol. 2, Data Collection and Analysis Methods (Baltimore, University Park Press).
JOHNSTON, J. M. & PENNYPACKER, H. S. (1980) Strategies and Tactics of Human Behavioural Research (Hillsdale, New Jersey, Erlbaum).
KERLINGER, F. (1964) Foundations of Behavioural Research (New York, Holt, Rinehart & Winston).
KIRK, J. & MILLER, M. L. (1985) Reliability and Validity in Qualitative Research (Beverly Hills, Sage).
LEHNER, P. N. (1979) Handbook of Ethological Methods (New York, Garland STPM Press).
LORD, F. M. & NOVICK, M. R. (1968) Statistical Theories of Mental Test Scores (Reading, Mass., Addison-Wesley).
MEDLEY, D. M. & MITZEL, H. E. (1963) Measuring classroom behaviour by systematic observation, in: N. L. GAGE (Ed.) Handbook of Research on Teaching (Chicago, Rand McNally).
MUELLER, J. H., SCHUESSLER, K. F. & COSTNER, H. L. (1977) Statistical Reasoning in Sociology, 3rd edn (Boston, Houghton Mifflin).
THURSTONE, L. L. (1937) The Reliability and Validity of Tests (Ann Arbor, Edwards).
TRYON, R. C. (1957) Reliability and behaviour domain validity, Psychological Bulletin, 54, pp. 229-49.
ZELLER, R. A. & CARMINES, E. G. (1980) Measurement in the Social Sciences (London, Cambridge University Press).
