Understanding and validity in qualitative research
Qualitative Research
JOSEPH A. MAXWELL
Harvard Graduate School of Education
Within the last few years, the issue of validity in qualitative research has come
to the fore. (Kvale, 1989, p. 7)
All field work done by a single field-worker invites the question, Why should
we believe it? (Bosk, 1979, p. 193)
Validity has long been a key issue in debates over the legitimacy of qualitative
research; if qualitative studies cannot consistently produce valid results, then
policies, programs, or predictions based on these studies cannot be relied on.
Proponents of quantitative and experimental approaches have frequently criticized
the absence of "standard" means of assuring validity, such as quantitative
measurement, explicit controls for various validity threats, and the formal testing
of prior hypotheses. Their critique has been bolstered by the fact that existing
categories of validity (for example, concurrent validity, predictive validity,
convergent validity, criterion-related validity, internal/external validity) are based
on positivist assumptions that underlie quantitative and experimental research
Harvard Educational Review
1 The difference between approaches to validity based on exemplars and those based on typologies
can be seen as one example of the distinction between syntagmatic (contextualizing or contiguity-
based) and paradigmatic (categorizing or similarity-based) strategies (Bruner, 1986; Jakobson, 1956;
Maxwell & Miller, n.d.). It is understandable that Mishler prefers a syntagmatic model for validity, since
his overall orientation to research is primarily syntagmatic rather than paradigmatic in its emphasis on
contextual and narrative analysis rather than categorization and comparison (Mishler, 1986). My
approach, in contrast, emphasizes the complementarity of syntagmatic and paradigmatic strategies.
purchased with techniques .... Rather, validity is like integrity, character, and
quality, to be assessed relative to purposes and circumstances" (1985, p. 13).
But defining types of validity in terms of procedures, an approach generally
labeled instrumentalist or positivist, is not the only approach available. The most
prevalent alternative is a realist conception of validity that sees the validity of an
account as inherent, not in the procedures used to produce and validate it, but
in its relationship to those things that it is intended to be an account of (Ham-
mersley, 1992; House, 1991; Maxwell, 1990a,b; Norris, 1983). This article is not
a response to or critique of Mishler's approach, but an alternative, complemen-
tary view: it presents a realist typology of the kinds of validity that I see as relevant
to qualitative research.2
In adopting a realist approach to validity, I am in basic agreement with the
main point of Wolcott's critique - that is, that understanding is a more fundamental
concept for qualitative research than validity (1990a, p. 146). I see the
types of validity that I present here as derivative from the kinds of understanding
gained from qualitative inquiry; my typology of validity categories is also a typol-
ogy of the kinds of understanding at which qualitative research aims (see Runci-
man, 1983).
However, in explicating the concept of validity in qualitative research, I want
to avoid applying or adapting the typologies developed for experimental and
quantitative research, for reasons quite separate from Mishler's disapproval of
procedure-based typologies. These typologies cannot be applied directly to qual-
itative research without distorting what qualitative researchers actually do in
addressing validity issues, and tautologically confirming quantitative researchers'
critiques.
An illustration of this is Campbell and Stanley's (1963) attack on what they
disparagingly refer to as the "one-shot case study." They argue that this design
is "well-nigh unethical" on the grounds that a single observation of one group,
following an intervention, with no control groups or prior measures, provides
no way of discriminating among numerous possible alternative explanations for
the outcome, explanations that any valid design must be able to rule out. From
an experimentalist's perspective, this argument is perfectly logical, but it completely
ignores the ways that qualitative researchers actually rule out validity
threats to their conclusions. Campbell later recognized the fallacy in his earlier
critique and retracted it, stating that "the intensive ... case study has a discipline
and a capacity to reject theories which are neglected in my caricature of the
method" (1975, p. 184).
This situation, I believe, is similar to one in the history of the study of kinship
terminologies in anthropology. Early investigators of kinship often assumed an
2 The instrumentalist approach to validity is simply one instance of the positivist or logical empiricist
program of substituting logical constructions, based on research operations and the sense data that
they generate, for inferred (that is, theoretical) entities (see Hunt, 1991; Manicas, 1987, p. 271; Phillips,
1990). As Norris (1983) points out, most approaches to construct validation have combined positivist
and realist assumptions without recognizing the problems that this inconsistency creates. One influential
validity typology that is predominantly (and explicitly) realist is that of Cook and Campbell (1979),
but even this has some residual positivist features (Maxwell, 1990a).
equivalent conceptual structure between English and the language of the society
they were studying (indeed, they often took the English terms to refer to real,
natural categories), and simply sought the equivalents for the English kinship
terms in the language being investigated. The major contribution of Lewis Henry
Morgan (1871) to the study of kinship, and the basis for nearly all subsequent
work on kinship terminology, was the recognition that societies have different
classification systems for relatives, systems that can differ markedly from those
of our own society and that cannot be represented adequately by a simple translation
or correlation of their terminology with that of the English language
(Trautmann, 1987, p. 57).
This article is thus intended to be a Morgan-like reformulation of the categorization
of validity in qualitative research - an account from "the native's point
of view" (Geertz, 1974) of the way qualitative researchers think about and deal
with validity in their actual practice. Any account of validity in qualitative re-
search, in order to be productive, should begin with an understanding of how
qualitative researchers actually think about validity. I am not assuming that qual-
itative methods for assessing validity are infallible, but a critique of these meth-
ods is beyond the scope of this paper. However, if my account of the categories
in which validity is conceived is valid, it obviously has implications for the latter
task.
In developing these categories, I will work not only with qualitative researchers'
explicit statements about validity - their "espoused theory" (Argyris
& Schoen, 1974) or "reconstructed logic" (Kaplan, 1964) - but also with the
ideas about validity that seem to me to be implicit in what they actually do -
their "theory-in-use" (Argyris & Schoen, 1974) or "logic-in-use" (Kaplan, 1964).
I am here following Einstein's advice that
if you want to find out anything from the theoretical physicists about the
methods they use, I advise you to stick closely to one principle: Don't listen to
their words, fix your attention on their deeds. (cited by Manicas, 1987, p. 242)
In this, however, I have the additional advantage that I am myself a qualitative
researcher and can draw on my own practice and my understanding of that
practice, in the same way that a linguist is able to draw on his or her own
"intuitions" about his or her native language in constructing an analysis of that
language. (I am not claiming infallibility for such intuitions, but simply acknowledging
them as a source of data.)
I do not think that qualitative and quantitative approaches to validity are
incompatible. I see important similarities between the two, and think that the
analysis I present here has implications for the concept of validity in quantitative
and experimental research. I am, however, arguing that a fruitful comparison
of the two approaches depends on a prior understanding of each of the approaches
in its own terms.
3 For a more general critique of objectivist views, see Bernstein (1983), Bohman (1991), Hammersley
and Atkinson (1983), and Lakoff (1987).
4 To apply the distinction introduced previously, between paradigmatic (similarity-based) and
syntagmatic (contiguity-based) relationships, I am conceptualizing the relationship between an account
and its object as based not on similarity [...] but
on contiguity - on the implications and consequences of adopting and acting on a particular account.
This approach obviously resembles "pragmatist" approaches in philosophy (Kaplan, 1964, pp. 42-46;
Rorty, 1979), and is analogous to Helmholtz's (1971) view of experience as a sign rather than an image
or reflection of the world (Manicas, 1987, p. 176 ff.); however, a discussion of these connections is
beyond the scope of this article. I have attempted to explore some of the implications of such a
"non-reflectionist" (McKinley, 1971) view elsewhere (Maxwell, 1979, 1986).
clude that the Gluecks's measurements of these variables are invalid? In order
to answer this question, it is necessary to ask what the Gluecks wish to learn from
their data" (Hirschi & Selvin, 1967, p. 195).
It is possible to construe data as a kind of account - a description at a very
low level of inference and abstraction. In this sense, it is sometimes legitimate
to speak of the validity of data, but this use is derived from the primary meaning
of validity as a property of accounts. In contrast, a method by itself is neither
valid nor invalid; methods can produce valid data or accounts in some circumstances
and invalid ones in others. Validity is not an inherent property of a
particular method, but pertains to the data, accounts, or conclusions reached
by using that method in a particular context for a particular purpose. To speak
of the validity of a method is simply a shorthand way of referring to the validity
of the data or accounts derived from that method.
I agree with Mishler that validity is always relative to, and dependent on, some
community of inquirers on whose perspective the account is based. Validity is
relative in this sense because understanding is relative; as argued above, it is not
possible for an account to be independent of any particular perspective. It is
always possible to challenge an account from outside that community and perspective,
but such a challenge amounts to expanding the community that is
concerned with the account and may change the nature of the validity issues in
ways to be discussed below.
However, the grounding of all accounts in some particular community and
perspective does not entail that all accounts are incommensurable in the sense
of not being comparable. Bernstein (1983) argues in detail that the incommensurability
thesis has been widely misinterpreted in this way, and that the rejection
of objectivism does not require the adoption of extreme relativism in this sense.
"What is sound in the incommensurability thesis has nothing to do with relativism,
or at least that form of relativism which wants to claim that there can be no
rational comparison among the plurality of theories ..." (Bernstein, 1983, p. 92;
emphasis in the original). He quotes Winch, one of the authors most often cited
in support of incommensurability:
We should not lose sight of the fact that the idea that men's ideas and beliefs
must be checkable by reference to something independent - some reality -
is an important one. To abandon it is to plunge straight into a Protagorean
relativism, with all the paradoxes that involves. (Winch, 1958, p. 11, cited in
Bernstein, 1983, p. 98)
Bernstein claims that incommensurability, properly understood, is not a rejection
of comparability, nor an abandonment of any attempt to assess the validity
of accounts, but instead offers a way to compare or assess accounts that
goes beyond the sterile opposition between objectivism and relativism (see also
Bernstein, 1991, pp. 57-78).
I argued above that validity pertains to the kinds of understanding that accounts
can embody. I see five broad categories of understanding that are relevant
to qualitative research, and five corresponding types of validity that concern
qualitative researchers. I will refer to these categories, respectively, as descriptive
Descriptive Validity
The first concern of most qualitative researchers is with the factual accuracy of
their account - that is, that they are not making up or distorting the things
they saw and heard. If you report that an informant made a particular statement
in an interview, is this correct? Did he or she really make that statement, or did
you mis-hear, mis-transcribe, or mis-remember his or her words? Did a particular
student in a classroom throw an eraser on a specific occasion? These matters of
descriptive accuracy are emphasized by almost every introductory qualitative
methods textbook in its discussion of the recording of fieldnotes and interviews.
All of the subsequent validity categories I will discuss are dependent on this
primary aspect of validity. As Geertz puts it, "behavior must be attended to, and
with some exactness, because it is through the flow of behavior - or, more
precisely, social action - that cultural forms find articulation" (1973, p. 17).
Wolcott, similarly, states that "description is the foundation upon which quali-
tative research is built" (1990b, p. 27) and that "whenever I engage in fieldwork,
I try to record as accurately as possible, and in precisely their words, what I judge
to be important of what people do and say" (Wolcott, 1990a, p. 128).
I will refer to this first type of validity as descriptive validity; it corresponds,
to some extent, to the category of understanding that Runciman (1983) calls
"reportage" or "primary understanding." Insofar as this category pertains to humans,
it refers to what Kaplan (1964, p. 358) calls "acts" rather than "actions"
- activities seen as physical and behavioral events rather than in terms of the
meanings that these have for the actor or others involved in the activity.
The above quotes refer mainly to what I will call primary descriptive validity:
the descriptive validity of what the researcher reports having seen or heard (or
touched, smelled, and so on). There is also the issue of what I will call secondary
descriptive validity: the validity of accounts of things that could in principle be
observed, but that were inferred from other data - for example, things that
happened in the classroom when the researcher was not present. (This secondary
description is also included in Runciman's concept of "reportage.") Secondary
descriptive validity can pertain to accounts for which the inference is highly
complex and problematic: for example, the claim that the person known as
William Shakespeare actually wrote Hamlet, or that a particular stone object was
used as a cutting tool by a member of an early human population. These issues
concern descriptive validity because they pertain to physical and behavioral
events that are, in principle, observable.
There are several characteristics of these sorts of descriptive concerns that I
want to emphasize. First, they all refer to specific events and situations. No issue
of generalizability or representativeness is involved. Second, they are all matters
on which, in principle, intersubjective agreement could easily be achieved, given
the appropriate data. For example, a tape recording of adequate quality could
be used to determine if the informant made a particular statement during the
interview, a videotape could be used to decide if the student threw the eraser,
and so on.
Put another way, the terms of the description (for example, "throw" in the
example above) are not problematic for the community involved in the discussion
of the event; their meaning - how they ought to be applied to events and
actions - is not in dispute, only the accuracy of the application. This situation
is quite different, for example, from the case of an account claiming that a
descriptive validity is that it does not involve statistical inference to some larger
universe than the phenomenon directly studied, but only the numerical descrip-
tion of the specific object of study. This is different from what Cook and Camp-
bell (1979) call "statistical conclusion validity," which refers to the validity of
inferences from the data to some population. I treat the latter issue below, as
one type of generalizability.
Reliability, in my view, refers not to an aspect of validity or to a separate issue
from validity, but to a particular type of threat to validity. If different observers
or methods produce descriptively different data or accounts of the same events
or situations, this puts into question the descriptive validity (and other types of
validity as well) of the accounts. This problem could be resolved either by mod-
ification of the accounts, so that different observers come to agree on their
descriptive accuracy, or by ascertaining that the differences were due to differences
in the perspective and purposes of the observers and were both descriptively
valid, given those perspectives and purposes.
Interpretive Validity
However, qualitative researchers are not concerned solely, or even primarily,
with providing a valid description of the physical objects, events, and behaviors
in the settings they study; they are also concerned with what these objects, events,
and behaviors mean to the people engaged in and with them. In this use of the
term meaning, I include intention, cognition, affect, belief, evaluation, and anything
else that could be encompassed by what is broadly termed the
"participants' perspective," as well as communicative meaning in a narrower
sense. This construction is inherently ideational or mental, rather than physical,
and the nature of the understanding, validity, and threats to validity that pertain
to it are significantly different from those involved in descriptive validity.
I will call this sort of understanding interpretive, and the type of validity
associated with it interpretive validity, following Erickson (1989).5 The term
"interpretive" is appropriate primarily because this aspect of understanding is most
5 As Eisner (1991, p. 35) notes, the term "interpretive" has two meanings in qualitative research.
One is the meaning I am adopting here; the other refers to studies that attempt to explain, as well as
describe, the things that they study. The latter use is similar to that of Merriam (1988, pp. 27-28) and
Patton (1990, p. 423), who use "descriptive" for studies that do not attempt to develop or apply explicit
theory and reserve "interpretive" for studies that generate theory or interpret the data from some
theoretical perspective. This second use of "interpretive" corresponds to the type of understanding
that I term "theoretical." Kaplan's distinction between "semantic explanation," or interpretation, and
"scientific explanation" is similar in some ways to my distinction between interpretation and theory,
but my use of "theory" includes semantic as well as explicitly explanatory theories.
My use of "interpretation" corresponds, confusingly, to what Runciman (1983) calls "description."
The latter term may be related to Geertz's (1973) phrase "thick description," which has been widely
employed in discussions of interpretive research. Thick description, for Geertz, is meaningful description
- that is, the description embedded in the cultural framework of the actors; the term does not
refer to the richness or detail of the account. This point is often misunderstood, in part because Geertz
uses the phrase precisely to avoid making the distinction between descriptive and interpretive
understanding that I have drawn here: between physical/behavioral description, on the one hand, and
inference to meaning as a mental phenomenon on the other. Thus, "thick description" pertains to
interpretive as well as to descriptive understanding, as I have defined these terms.
central to interpretive research, which seeks to comprehend phenomena not on
the basis of the researcher's perspective and categories, but from those of the
participants in the situations studied - that is, from an "emic" rather than an
"etic" perspective (Bohman, 1991; Headland, Pike, & Harris, 1990). In contrast
to descriptive validity, which could apply equally well to quantitative and qualitative
research, interpretive validity has no real counterpart in quantitative-
experimental validity typologies.
Thus, while the terms involved in descriptive validity can be either etic or
emic, interpretive validity necessarily pertains to aspects of an account for which
the terms are emic. This is because, while accounts of physical and behavioral
phenomena can be constructed from a variety of perspectives, accounts of meaning
must be based initially on the conceptual framework of the people whose
meaning is in question. These terms are often derived to a substantial extent
from the participants' own language. The terms are also necessarily, to use
Geertz's phrase (1974), "experience-near" - based on the immediate concepts
employed by participants (for example, "love"), rather than on theoretical abstractions
(for example, "object cathexis").
Like descriptive validity, then, interpretive validity, while not atheoretical,
refers to aspects of accounts for which the terms of the account are not themselves
problematic. Interpretive accounts are grounded in the language of the
people studied and rely as much as possible on their own words and concepts.6
The issue, again, is not the appropriateness of these concepts for the account,
but their accuracy as applied to the perspective of the individuals included in
the account. For example, was the teacher, in yelling at the student for throwing
the eraser, really "mad" at the student, or just trying to "get control" of the class?
It is ironic that Geertz adopted this term, and its associated philosophical argument, as a
characterization of interpretive research, because Gilbert Ryle, who coined the term in his work The Concept
of Mind (1949), used it as part of an explicit attempt to eliminate mental concepts (what he referred
to as "the ghost in the machine") from philosophy and to replace them with dispositional statements
referring to an individual's propensity to behave in particular ways. This approach, which came to be
known as "logical behaviorism," was a classically positivist strategy of replacing theoretical entities with
logical constructions based on observables. (As noted above, this strategy was the Achilles' heel of
positivism.)
This positivist view that mental constructs are theoretical abstractions that ultimately refer to behavior
and behavioral dispositions is quite different from the realist position I take here, that such mental
constructs refer to unobservable but real entities whose existence is inferred from observations of
behavior (see Manicas, 1987).
Within the category of interpretive understanding, it is possible to make a distinction, similar to
that between primary and secondary descriptive understanding, between the communicative meaning
of speech or actions (which is nonetheless always meaning for some actor or interpreter) and the
actor's subjective intentions, beliefs, values, and perspective (see Gellner, 1962; Hannerz, 1992,
pp. 3-4; Keesing, 1987, pp. 174-175; Ricoeur, 1981). Both of these types of understanding are
ultimately based on inferences from the descriptive evidence, but the validity of inferences to the actor's
subjective states depends on the validity of the researcher's account of the meaning of the actor's words
and actions. My own alternative to Geertz's attempt to get meaning out of the "secret grotto in the
skull" and into the public world rests on this distinction between meaning as a property of discourse,
on the one hand, and the actor's subjective states, on the other.
6 In providing a valid account of individuals who lack such an accessible language, such as preverbal
children, interpretive validity merges with the following category, theoretical validity.
While the relevant consensus about the categories used in description rests in
the research community, the relevant consensus for the terms used in interpretation
rests to a substantial extent in the community studied.
Unlike descriptive validity, however, for interpretive validity there is no in-
principle access to data that would unequivocally address threats to validity.
Interpretive validity is inherently a matter of inference from the words and ac-
tions of participants in the situations studied. The development of accounts of
these participants' meanings is usually based to a large extent on the
participants' own accounts, but it is essential not to treat these latter accounts
as incorrigible; participants may be unaware of their own feelings or views, may
recall these inaccurately, and may consciously or unconsciously distort or conceal
their views. Accounts of participants' meanings are never a matter of direct
access, but are always constructed by the researcher(s) on the basis of participants'
accounts and other evidence.7
The realist approach to validity that I am adopting here has been held by
some interpretive researchers to be incompatible with a concern for interpretive
understanding. For example, Lincoln has argued that "critical realism's assump-
tion that there is a singular reality 'out there' ... ignores the issue of whether
that reality is recognized or rejected by those who may be disadvantaged by that
construction" (1990, p. 510).
This critique misses the point that the meanings and constructions of actors
are part of the reality that an account must be tested against in order to be
interpretively as well as descriptively valid. Social theorists generally agree that
any valid account or explanation of a social situation must respect the perspec-
tives of the actors in that situation, although it need not be centered on that
perspective (Bohman, 1991; Harre, 1978; Menzel, 1978). My inclusion of inter-
pretive validity in this typology is a recognition of this consensus: that a key part
of the realm external to an account is the perspective of those actors whom the
account is about (see House, 1991).
Interpretive validity does not apply only to the conscious concepts of partici-
pants; it can also pertain to the unconscious intentions, beliefs, concepts, and
values of these participants, and to what Argyris and Schoen (1974) call
"theory-in-use," as opposed to "espoused theory." However, this aspect of interpretive
7 I cannot deal systematically here with one challenge to this approach embodied in the
post-structuralist slogan that "there is nothing outside the text." I agree with Manicas that this view
represents not just a repudiation of realist conceptions of validity, but "an epistemological nihilism in which
truth is an illusion" (1987, p. 269). I would also argue that, ironically, this approach exemplifies the
same goal of eliminating inferred entities that characterized positivism and is vulnerable to the same
critiques that led to positivism's demise.
I want to emphasize that my distinction between descriptive and interpretive validity is not between
the "real world" and actors' constructions of that world. First, both descriptive and interpretive
understanding pertain to the researcher's accounts of the world - that is, to accounts of its physical/behavioral
and mental aspects or components, respectively. Both types of accounts are the researcher's
constructions. Second, the physical and mental components refer to entities that are equally real, rather
than one being a reflection of the other. (For a more detailed analysis of the relationship between the
mental and physical frameworks, see Maxwell, 1986.)
validity also raises another category of understanding and validity, which, following
Kirk and Miller (1986), I will call "theoretical validity."
Theoretical Validity
The two previous types of understanding have a number of similarities. First,
they depend on a consensus within the relevant community about how to apply
the concepts and terms used in the account; any disagreements refer to their
accuracy, not their meaning. Second, and closely connected to the first, the
concepts and terms employed are "experience-near," in Geertz's sense (1974).
There are two major differences between theoretical understanding and the
two types discussed previously. The first is the degree of abstraction of the account
in question from the immediate physical and mental phenomena studied.
The reason for calling this sort of understanding theoretical is that it goes beyond
concrete description and interpretation and explicitly addresses the theoretical
constructions that the researcher brings to, or develops during, the study.
This theory can refer to either physical events or mental constructions. It can
also incorporate participants' concepts and theories, but its purpose goes be-
yond simply describing these participants' perspectives. This distinction com-
prises the second major difference between the theoretical validity of an account
and the descriptive or interpretive validity of the same account: theoretical un-
derstanding refers to an account's function as an explanation, as well as a descrip-
tion or interpretation, of the phenomena.
Theoretical validity thus refers to an account's validity as a theory of some
phenomenon. Any theory has two components: the concepts or categories that
the theory employs, and the relationships that are thought to exist among these
concepts. Corresponding to these two aspects of a theory are two aspects of
theoretical validity: the validity of the concepts themselves as they are applied
to the phenomena, and the validity of the postulated relationships among the
concepts. The first refers to the validity of the blocks from which the researcher
builds a model, as these are applied to the setting or phenomenon being studied;
the second refers to the validity of the way the blocks are put together, as a
theory of this setting or phenomenon.
For example, one could label the student's throwing of the eraser as an act
of resistance, and connect this act to the repressive behavior or values of the
teacher, the social structure of the school, and class relationships in U.S. society.
The identification of the throwing as "resistance" constitutes the application of
a theoretical construct to the descriptive and interpretive understanding of the
action; the connection of this to other aspects of the participants, the school, or
the community constitutes the postulation of theoretical relationships among
these constructs.
The first of these aspects of theoretical validity closely matches what is generally
known as construct validity, and is primarily what Kirk and Miller (1986)
mean by theoretical validity. The second aspect includes, but is not limited to,
what is commonly called internal or causal validity (Cook & Campbell, 1979); it
corresponds to what Runciman calls "explanation" and in part to what Erickson
calls "critical validity."11 This second aspect is not limited to causal validity
because theories or models can be developed for other things besides causal
explanation - for example, for semantic relationships, narrative structure, and so
on - that nevertheless go beyond description and interpretation. Theories can,
and usually do, incorporate both descriptive and interpretive understanding, but
in combining these they necessarily transcend either of them.
What counts as theoretical validity, rather than descriptive or interpretive
validity, depends on whether there is consensus within the community concerned
with the research about the terms used to characterize the phenomena.
Issues of descriptive and interpretive validity focus on the accuracy of the application
of these terms (Did the student really throw the eraser? Was the teacher
really angry?) rather than their appropriateness (Does what the student did
count as resistance?). Theoretical validity, in contrast, is concerned with prob-
lems that do not disappear with agreement on the "facts" of the situation; the
issue is the legitimacy of the application of a given concept or theory to established
facts, or indeed whether any agreement can be reached about what the
facts are.
The distinction between descriptive or interpretive and theoretical validity is
not an absolute, because (contrary to the assumptions of positivism) objective
"sense data" that are independent of the researcher's perspective, purposes, and
theoretical framework do not exist. My distinction between the two types is not
based on any such assumption, but on the presence or absence of agreement
within the community of inquirers about the descriptive or interpretive terms
used. Any challenge to the meaning of the terms, or the appropriateness of their
application to a given phenomenon, shifts the validity issue from descriptive or
interpretive to theoretical.
These three types of understanding and validity are the ones most directly
involved in assessing a qualitative account as it pertains to the actual situation
on which the account is based. There are, however, two additional categories of
validity issues that I want to raise. The first of these deals with the generalizability
of an account, or what is often labeled external validity; the second pertains to
the evaluative validity of an account.
Generalizability
Generalizability refers to the extent to which one can extend the account of a
particular situation or population to other persons, times, or settings than those
directly studied. This issue plays a different role in qualitative research than it
does in quantitative and experimental research, because qualitative studies are
usually not designed to allow systematic generalizations to some wider popula-
tion. Generalization in qualitative research usually takes place through the de-
velopment of a theory that not only makes sense of the particular persons or
situations studied, but also shows how the same process, in different situations,
can lead to different results (Becker, 1990, p. 240). Generalizability is normally
based on the assumption that this theory may be useful in making sense of
similar persons or situations, rather than on an explicit sampling process and
the drawing of conclusions about a specified population through statistical
inference (Yin, 1984).
This is not to argue that issues of sampling, representativeness, and
generalizability are unimportant in qualitative research. They are crucial when-
ever one wants to draw inferences from the actual persons, events, or activities
observed to other persons, events, or situations, or to these at other times than
when the observation was done. (The particular problems of interviewing will
be dealt with below.) Qualitative research almost always involves some of this
sort of inference because it is impossible to observe everything, even in one small
setting. The sort of sampling done in qualitative research is usually "purposeful"
(Patton, 1990) or "theoretical" (Strauss, 1987) sampling, rather than random
sampling or some other method of attaining statistical representativeness. The
goal of the former types of sampling is twofold: to make sure one has adequately
understood the variation in the phenomena of interest in the setting, and to test
developing ideas about that setting by selecting phenomena that are crucial to
the validity of those ideas.
In qualitative research, there are two aspects of generalizability: generalizing
within the community, group, or institution studied to persons, events, and set-
tings that were not directly observed or interviewed; and generalizing to other
communities, groups, or institutions. I will refer to the former as internal
generalizability, and to the latter as external generalizability. The distinction is
analogous to Cook and Campbell's (1979) distinction in quasi-experimental research
between statistical conclusion validity and external validity. This distinction
is not clear-cut or absolute in qualitative research. A researcher studying a
school, for example, can rarely visit every classroom, or even gain information
about these classrooms by other means, and the issue of whether to consider the
generalizability of the account for those unstudied classrooms internal or external
is moot. However, it is important to be aware of the extent to which the times
and places observed may differ from those that were not observed, either because
of sampling or because of the effect of the observation itself.9
Internal generalizability in this sense is far more important for most qualitative
researchers than is external generalizability because qualitative researchers
rarely make explicit claims about the external generalizability of their accounts.
Indeed, the value of a qualitative study may depend on its lack of external
generalizability in a statistical sense; it may provide an account of a setting or
population that is illuminating as an extreme case or "ideal type." Freidson,
discussing his qualitative study of a medical group practice, notes that
there is more to truth or validity than statistical representativeness. . . . In this
study I am less concerned with describing the range of variation than I am with
describing in the detail what survey questionnaire methods do not permit to
be described - the assumptions, behavior, and attitudes of a very special set
of physicians. They are interesting because they were special. (1975, pp. 272-273)
He argues that his study makes an important contribution to theory and policy
precisely because this was a group for whom social controls on practice should
have been most likely to be effective. The failure of such controls in this case
not only elucidates a social process that is likely to exist in other groups, but also
provides a more persuasive argument for the unworkability of such controls than
would a study of a "representative" group.
Interviewing poses some special problems for internal generalizability because
the researcher usually is in the presence of the person interviewed only briefly,
and must necessarily draw inferences from what happened during that brief
period to the rest of the informant's life, including his or her actions and perspectives.
An account based on interviews may be descriptively, interpretively,
and theoretically valid as an account of the person's actions and perspective in
that interview, but may miss other aspects of the person's perspectives that were
not expressed in the interview, and can easily lead to false inferences about his
or her actions outside the interview situation. Thus, internal generalizability is
a crucial issue in interpreting interviews, as is widely recognized, for example,
9 The difference between secondary descriptive validity and internal generalizability deserves
clarification, because the two seem superficially similar. Both refer to the extension of one's account to
things that were not directly observed but that remain within the setting or group studied. The
difference between the two is based on the kind of relationship postulated between the immediate data
or account and the claim whose validity is in question. In internal (as well as external) generalizability,
the claim is that those things not directly observed are similar to those described in the account; that
the account can be generalized to some wider context. For secondary descriptive validity, the issue is not
similarity, but the validity of the chain of inference from one's primary data to things that were not
directly observed - for example, whether one can infer, from an eraser on the floor, a chalk mark on
the wall above it, and a student's sullen silence about what had happened when the researcher briefly
left the room, that an eraser was thrown. There is no assumption that the primary data resemble or
generalize to the secondary conclusion in any way, only that the inferential connection between the
two is valid. In addition, internal (and external) generalizability can pertain to interpretive, theoretical,
or evaluative conclusions, as well as descriptive ones.
by Dean and Whyte (1958) and Dexter (1970).10 The interview is a social situation
and inherently involves a relationship between the interviewer and the informant.
Understanding the nature of that situation and relationship, how it
affects what goes on in the interview, and how the informant's actions and views
could differ in other situations is crucial to the validity of accounts based on
interviews (Briggs, 1986; Mishler, 1986).
Evaluative Validity
Beyond all of the validity issues discussed above are validity questions about such
statements as, "The student was wrong to throw the eraser at the teacher," or,
"The teacher was illegitimately failing to recognize minority students' perspec-
tives." This aspect of validity differs from the types discussed previously in that
it involves the application of an evaluative framework to the objects of study,
rather than a descriptive, interpretive, or explanatory one. It corresponds to
Runciman's "evaluation" as a category of understanding (1983), and is an important
component of what Erickson (1989) terms "critical validity."
I have little to say about evaluative validity that has not been said more co-
gently by others. In raising it here, my purpose is twofold: to acknowledge eval-
uative validity as a legitimate category of understanding and validity in qualitative
research, and to suggest how it relates to the other types of validity discussed.
Like external generalizability, evaluative validity is not as central to qualitative
research as are descriptive, interpretive, and theoretical validity: many research-
ers make no claim to evaluate the things they study. Furthermore, issues of
evaluative understanding and evaluative validity in qualitative research do not
seem to me to be intrinsically different from those in any other approach to
research; debates about whether the student's throwing of the eraser was legiti-
mate or justified do not depend on the methods used to ascertain that this
happened or to decide what interpretive or theoretical sense to make of it,
although they do depend on the particular description, interpretation, or theory
one constructs. To raise questions about the evaluative framework implicit in an
account, however, as many critical theorists do, creates issues of an account's
evaluative validity, and no account is immune to such questions.
Implications
I have presented a model of the types of validity that I believe are relevant to,
and often implicit in, qualitative research. I have approached this task from a
realist perspective, and have argued that this realist approach, which bases va-
lidity on the kinds of understanding we have of the phenomena we study, is
more consistent and productive than prevailing positivist typologies based on
10 I have indicated earlier my disagreement with the approach that deals with this problem by
denying it - that is, by treating the interview (or even the interview transcript) as a "text" and asserting
that it is illegitimate to attempt to make inferences to some "real" actor.
research procedures. A realist view of validity both avoids the philosophical and
practical difficulties associated with positivist approaches and seems to me to
better represent what qualitative researchers actually do in assessing the validity
of their accounts.
However, having presented this typology, I must add that validity categories
are of much less direct use in qualitative research than they are (or are assumed
to be) in quantitative and experimental research. In the latter, threats to validity
are addressed in an anonymous, generic fashion by prior design features (such
as randomization and controls) that can deal with both anticipated and unanticipated
threats to validity. In qualitative research, however, such prior elimination
of threats is less possible, both because qualitative research is more inductive
and because it focuses primarily on understanding particulars rather
than generalizing to universals (Erickson, 1986). Qualitative researchers deal
primarily with specific threats to the validity of particular features of their ac-
counts, and they generally address such threats by seeking evidence that would
allow them to be ruled out. In doing this, they are using a logic similar to that
of quasi-experimental researchers such as Cook and Campbell (1979).
This strategy of addressing particular threats to validity, or alternative hypoth-
eses, after a tentative account has been developed, rather than by attempting to
eliminate such threats through prior features of the research design, is in fact
more fundamental to scientific method than is the latter approach (Campbell,
1988; Platt, 1964). This method is accepted by qualitative researchers from a
wide range of philosophical positions (for example, Eisner, 1991; Hammersley
& Atkinson, 1983; Miles & Huberman, 1984; Patton, 1990). Its application to
causal inference has been labeled the "modus operandi" approach by Scriven
(1974), but the method has received little formal development in the qualitative
research literature, although it is implicit in many substantive qualitative studies.
Thus, researchers cannot use the typology presented here to eliminate, directly
and mechanically, particular threats to the validity of their accounts. Qualitative
researchers already have many methods for addressing validity threats,
and, although there are ways that the state of the art could be improved (see
Eisenhart & Howe, 1992; Miles & Huberman, 1984; Wolcott, 1990a), that is not
my main goal here. Instead, I am trying to clarify the validity concepts that many
qualitative researchers are using - explicitly or implicitly - in their work, to
tie these concepts into a systematic model, and to reduce the discrepancy between
qualitative researchers' "logic-in-use" and their "reconstructed logic"
(Kaplan, 1964, pp. 3-11) - a discrepancy that I think has caused both substantial
misunderstanding of qualitative research and some shortcomings in its valida-
tion practices. I see this typology as being useful both as a checklist of the kinds
of threats to validity that one needs to consider and as a framework for thinking
about the nature of these threats and the possible ways that specific threats might
be addressed.
I do not see the typological framework presented in this article as antithetical
to the exemplar-based approach that Mishler has advocated. In fact, one of my
main assumptions is that category-based and context-based approaches to qualitative
research in general, and to validity in particular, are both legitimate, and
that these are compatible and complementary, rather than competing, alternatives
(Maxwell & Miller, n.d.). The ways in which these two approaches could be
used in combination are a topic beyond the scope of this article, but I hope that
the analysis I have presented is helpful in facilitating this rapprochement.
References
Argyris, C., & Schoen, D. A. (1974). Theory in practice: Increasing professional effectiveness. San Francisco: Jossey-Bass.
Becker, H. S. (1970). Problems of inference and proof in participant observation. In H. S. Becker (Ed.), Sociological work: Method and substance (pp. 25-58). New Brunswick, NJ: Transaction Books.
Becker, H. S. (1990). Generalizing from case studies. In E. W. Eisner & A. Peshkin (Eds.), Qualitative research in education: The continuing debate (pp. 233-242). New York: Teachers College Press.
Bernstein, R. J. (1983). Beyond objectivism and relativism. Philadelphia: University of Pennsylvania Press.
Bernstein, R. J. (1991). The new constellation: The ethical-political horizons of modernity/postmodernity. Cambridge, MA: MIT Press.
Bhaskar, R. (1989). Reclaiming reality. London: Verso.
Bohman, J. (1991). New philosophy of social science. Cambridge, MA: MIT Press.
Bosk, C. (1979). Forgive and remember: Managing medical failure. Chicago: University of Chicago Press.
Briggs, C. (1986). Learning how to ask: A sociolinguistic appraisal of the role of the interview in social science research. Cambridge, Eng.: Cambridge University Press.
Brinberg, D., & McGrath, J. E. (1985). Validity and the research process. Newbury Park, CA: Sage.
Bruner, J. (1986). Two modes of thought. In J. Bruner (Ed.), Actual minds, possible worlds (pp. 11-43). Cambridge, MA: Harvard University Press.
Campbell, D. T. (1975). "Degrees of freedom" and the case study. Comparative Political Studies, 8(2), 178-193. (Reprinted in Campbell, 1988)
Campbell, D. T. (1988). Methodology and epistemology for social science: Selected papers. Chicago: University of Chicago Press.
Campbell, D. T., & Stanley, J. (1963). Experimental and quasi-experimental designs for research on teaching. In N. L. Gage (Ed.), Handbook of research on teaching (pp. 171-246). Chicago: Rand McNally.
Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Boston: Houghton Mifflin.
Dean, J., & Whyte, W. F. (1958). How do you know if the informant is telling the truth? Human Organization, 17(2), 34-38. (Reprinted in L. A. Dexter, 1970, pp. 119-131)
Dexter, L. A. (1970). Elite and specialized interviewing. Evanston, IL: Northwestern University Press.
Eisenhart, M., & Howe, K. (1992). Validity in educational research. In M. D. LeCompte, W. L. Millroy, & J. Preissle (Eds.), The handbook of qualitative research in education (pp. 643-680). San Diego: Academic Press.
Eisner, E. W. (1991). The enlightened eye: Qualitative inquiry and the enhancement of educational practice. New York: Macmillan.
Erickson, F. (1986). Qualitative methods in research on teaching. In M. C. Wittrock (Ed.), Handbook of research on teaching (3rd ed.). New York: Macmillan.
Erickson, F. (1989, March). The meaning of validity in qualitative research. Unpublished paper presented at the annual meeting of the American Educational Research Association, San Francisco.
Freidson, E. (1975). Doctoring together: A study of professional social control. Chicago: University of Chicago Press.
Geertz, C. (1973). Thick description: Toward an interpretive theory of culture. In C. Geertz, The interpretation of cultures (pp. 3-30). New York: Basic Books.
Geertz, C. (1974). "From the native's point of view": On the nature of anthropological understanding. Bulletin of the American Academy of Arts and Sciences, 28(1), 26-45. (Reprinted in K. H. Basso & H. A. Selby, Eds., 1976, Meaning in anthropology, pp. 221-237. Albuquerque: University of New Mexico Press)
Gellner, E. (1962). Concepts and society. Transactions of the Fifth World Congress of Sociology, Vol. 1 (pp. 153-183). Louvain, Belg.: International Sociological Association. (Reprinted in B. R. Wilson, Ed., 1970, Rationality, pp. 18-49. London: Harper & Row)
Goetz, J. P., & LeCompte, M. D. (1984). Ethnography and qualitative design in educational research. San Diego: Academic Press.
Guba, E. G., & Lincoln, Y. S. (1989). Fourth generation evaluation. Newbury Park, CA: Sage.
Hammersley, M. (1992). What's wrong with ethnography? London: Routledge.
Hammersley, M., & Atkinson, P. (1983). Ethnography: Principles in practice. London: Tavistock.
Hannerz, U. (1992). Cultural complexity: Studies in the social organization of meaning. New York: Columbia University Press.
Harre, R. (1978). Accounts, actions, and meanings - The practice of participatory psychology. In M. Brenner, P. Marsh, & M. Brenner (Eds.), The social contexts of method. New York: St. Martin's Press.
Headland, T. N., Pike, K. L., & Harris, M. (Eds.). (1990). Emics and etics: The insider/outsider debate. Newbury Park, CA: Sage.
Helmholtz, H. (1971). Selected writings (R. Kahn, Ed.). Middletown, CT: Wesleyan University Press.
Hirschi, T., & Selvin, H. C. (1967). Principles of survey analysis. New York: Free Press.
House, E. (1991). Realism in research. Educational Researcher, 20(6), 2-9.
Hunt, S. D. (1991). Positivism and paradigm dominance in consumer research: Toward critical pluralism and rapprochement. Journal of Consumer Research, 18, 32-44.
Jakobson, R. (1956). Two aspects of language and two types of aphasic disturbance. In R. Jakobson & M. Halle (Eds.), Fundamentals of language (pp. 55-82). The Hague: Mouton. (Reprinted in R. Jakobson, 1987, Language in literature. Cambridge, MA: Harvard University Press)
Kaplan, A. (1964). The conduct of inquiry. San Francisco: Chandler.
Keesing, R. (1987). Anthropology as interpretive quest. Current Anthropology, 28, 161-176.
Kitcher, P., & Salmon, W. C. (Eds.). (1989). Scientific explanation. Minneapolis: University of Minnesota Press.
Kirk, J., & Miller, M. (1986). Reliability and validity in qualitative research. Newbury Park, CA: Sage.
Kuhn, T. S. (1970). The structure of scientific revolutions (2nd ed.). Chicago: University of Chicago Press.
Kvale, S. (1989). Introduction. In S. Kvale (Ed.), Issues of validity in qualitative research. Lund, Sweden: Studentlitteratur.
Lakoff, G. (1987). Women, fire, and dangerous things: What categories reveal about the mind. Chicago: University of Chicago Press.
Lincoln, Y. S. (1990). Campbell's retrospective and a constructivist's perspective. Harvard Educational Review, 60, 501-504.
MacIntyre, A. (1967). The idea of a social science. Aristotelian Society Supplement, 41, 93-114.
Manicas, P. T. (1987). A history and philosophy of the social sciences. Oxford, Eng.: Blackwell.
Maxwell, J. A. (1979). The evolution of Plains Indian kin terminologies: A non-reflectionist account. Plains Anthropologist, 23(79), 13-29.
Maxwell, J. A. (1986). The conceptualization of kinship in an Inuit community: A cultural account. Unpublished doctoral dissertation, University of Chicago.