INDUCTIVE REASONING
Evan Heit is currently Professor of Psychology and Cognitive Science, and Found-
ing Faculty, at the University of California, Merced. Previously, Dr. Heit was on
the faculty in the Psychology Department of the University of Warwick, UK. He
has undergraduate degrees in computer science and psychology from the Univer-
sity of Pennsylvania and a Ph.D. from Stanford University. He also carried out
postdoctoral research at the University of Michigan and Northwestern University.
Professor Heit has published more than fifty papers on the psychology of reason-
ing, memory, and categorization. His research has been funded by the National
Science Foundation, the National Institutes of Health, the Economic and Social
Research Council (UK), and the Biotechnology and Biological Sciences Research
Council (UK). He is currently on the editorial board of Memory and Cognition
and the Journal of Experimental Psychology: Learning, Memory, and Cognition and
is Associate Editor of the Journal of Memory and Language.
Inductive Reasoning
Experimental, Developmental, and
Computational Approaches
Edited by
AIDAN FEENEY
Durham University
EVAN HEIT
University of California, Merced
CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo
Cambridge University Press has no responsibility for the persistence or accuracy of urls
for external or third-party internet websites referred to in this publication, and does not
guarantee that any content on such websites is, or will remain, accurate or appropriate.
Preface
Books on induction are rare; in fact, you are holding in your hands the first
book devoted solely to the psychology of inductive reasoning in twenty years.
And yet induction is a central topic in cognitive science, fundamental to
learning, discovery, and prediction. We make inductive inferences every time
we use what we already know to deal with novel situations. For example,
wherever you found this book – in your university library, online, or in an
academic bookshop – before picking it from the shelf or clicking on a link,
you will have made some inductive inferences. Amongst other things, these
inferences will have concerned the book’s likely content given its title, its
editors, its publisher, or its cover. Common to all of these inferences will have
been the use of what you already know – about us, or the topic suggested
by our title, or Cambridge University Press – to make predictions about the
likely attributes of this book.
It is not only in publishers’ catalogues that attention to induction has
been scant. Despite its obvious centrality to an understanding of human
cognition and behaviour, much less work has been carried out by psychologists
on induction than on deduction, or logical reasoning. As a consequence,
although there have been several edited collections of work on deduction
and on decision making (Connolly, Arkes, & Hammond, 2000; Leighton &
Sternberg, 2004; Gilovich, Griffin, & Kahneman, 2002; Manktelow & Chung,
2004), to the best of our knowledge there has never previously been an edited
collection of papers on the psychology of inductive inference.
To attempt to redress the balance somewhat, in 2004 we organised a sympo-
sium on induction at the Fifth International Conference on Thinking, which
was held in the historical setting of the University of Leuven in Belgium.
The series of international conferences on thinking, of which the meeting
in Leuven was part, has been primarily concerned with deductive inference,
problem solving, and decision making. Our symposium was intended to raise
the profile, in Europe particularly, of work on induction. Many, but not all, of
the chapter authors for this book talked at the symposium, which took place
on Saturday 24 July. Doug Medin, Pat Shafto, Paul Thagard, Brett Hayes, Josh
Tenenbaum, Evan Heit, and Aidan Feeney all talked about their work on in-
duction, and Steven Sloman was the discussant. John Coley, Mike Oaksford,
and Bob Rehder were also present and contributed greatly on the day, and
Lance Rips had also been visiting Leuven. The symposium went so well that
we decided to press on with plans for a book. But what kind of book should
it be?
The last book devoted wholly to the psychology of induction was Hol-
land, Holyoak, Nisbett, and Thagard’s landmark book on induction, which
appeared in 1986. Broadly speaking, Holland and colleagues’ book attempted
to apply just one general explanatory framework to a variety of inductive
phenomena including analogy, scientific discovery, generalisation, and rule
learning. Since 1986 the field has changed and developed enormously. There
have been great accomplishments in the study of how inductive reasoning
develops, as well as a focus on precise mathematical modelling of induc-
tion throughout the 1990s, culminating in very powerful Bayesian models
of inductive thinking that have emerged over the last few years. Other de-
velopments include a recent focus on the relationship between induction
and deduction, widespread consideration of the role of categories and back-
ground knowledge in induction, and significant progress on questions about
individual and cultural differences in inductive generalisation.
To emphasise how the field has changed since the mid-1980s, and because
of the range of work on induction that has emerged in the intervening time,
this book reverses the tactic employed by Holland and colleagues; instead of
applying one approach to a variety of inductive phenomena, for the most part
this book will focus on how a wide range of methodological and modelling
approaches have increased our understanding of induction.
There is often a concern that edited books can lack coherence. Given that
we wished to collect contributions on a wide range of approaches, this could
have been a problem here. However, as most chapter authors presented at,
or attended, the symposium in Leuven, we were all aware while writing our
chapters of the concerns of other chapter authors. In addition, every chapter
was read by two other chapter authors and every chapter was redrafted in the
light of other authors’ comments. Our goal throughout has been to achieve
integration and overlap between chapters, and we hope that we have achieved
coherence.
one of the authors of the 1986 book on induction, and his chapter brings
an especially broad perspective to the study of thinking. As is apparent from
this chapter, there are many types of thinking, and Thagard’s comments on
the likely frequency of deductive and inductive inference in everyday life are
sobering.
Chapter 10, by Lance Rips and Jennifer Asmuth, although explicitly about
a rarefied form of reasoning in mathematics, is ultimately concerned with the
relationship between deduction and induction. Previous work (Rips, 2001)
suggests that inductive and deductive reasoning are dissociated. Rips and
Asmuth consider the case of mathematical induction, which they view as a
form of deductive thinking. Interestingly, even quite advanced students of
mathematics have problems with this form of reasoning.
Mike Oaksford and Ulrike Hahn in Chapter 11 argue for a probabilistic
approach to argument evaluation. They claim that induction and deduction
may be treated as belief-updating problems, although they do not rule out
the possibility that some small amount of deduction takes place in the world.
Because of its Bayesian character, this chapter is complementary to the chapter
by Tenenbaum and colleagues. Also notable in this chapter is an attempt
to get to grips with, from a Bayesian point of view, well-known informal
reasoning fallacies, such as the argument from ignorance. Oaksford and Hahn
demonstrate that some of these so-called fallacies may be justified from a
Bayesian point of view.
In Chapter 12, Feeney takes a different perspective on the relationship
between induction and other forms of thinking. Whereas Rips and Asmuth
suggest that induction and deduction are different types of thinking, and
Oaksford and Hahn suggest that, generally speaking, there is only one type
of thinking, Feeney makes the argument that induction is no different from
deduction in that both generally require the operation of two different types
of thinking process. One of these is fast and associative whilst the other is
slow and symbol manipulating (see Evans & Over, 1996; Sloman, 1996). The
dual-process argument is not new, but its application here leads to a novel
conception of inductive reasoning, which is often thought of as involving only
associative processes. Interestingly, however, causal knowledge is somewhat
problematic for this dual-process view.
Finally, in Chapter 13, Steven Sloman comments on the other chapters in
the book and draws together their common themes and implications in an
interesting and provocative way. We do not want to spoil his punchlines here,
but he has many interesting things to say about the relationship between types
of thinking and about Bayesian approaches to induction. In many ways, this
is the most important chapter in this book, as its role is to explicitly comment
on the state of the field rather than merely describing and interpreting what
is to be found in the literature.
We wish to thank the organisers of the Fifth International Conference on
Thinking for providing the forum for the meeting that led to this book. Darren
Dunning helped us prepare the manuscript for submission to the publisher.
Finally, we would like to thank all of the contributors to the symposium on
induction, including Dinos Hadjichristidis, Fenna Poletiek, and Vinod Goel,
and all of the contributors to this book.
References
Connolly, T., Arkes, H. R., & Hammond, K. R. (2000). Judgement and decision making:
An interdisciplinary reader. Cambridge, UK: Cambridge University Press.
Evans, J. St. B. T., & Over, D. E. (1996). Rationality and reasoning. Hove, UK: Psychology
Press.
Gilovich, T., Griffin, D., & Kahneman, D. (Eds.). (2002). Heuristics and biases: The psychology of intuitive judgment. Cambridge, UK: Cambridge University Press.
Leighton, J. P., & Sternberg, R. J. (2004). The nature of reasoning. Cambridge, UK: Cambridge University Press.
Manktelow, K., & Chung, M. C. (2004). Psychology of reasoning: Theoretical and historical
perspectives. Hove, UK: Psychology Press.
Rips, L. J. (2001). Two kinds of reasoning. Psychological Science, 12, 129–134.
Sloman, S. A. (1996). The empirical case for two systems of reasoning. Psychological
Bulletin, 119, 3–22.
Editors
Aidan Feeney
Durham University
Evan Heit
University of California, Merced
Why study induction, and indeed, why should there be a whole book devoted
to the study of induction? The first reason is that inductive reasoning cor-
responds to probabilistic, uncertain, approximate reasoning, and as such, it
corresponds to everyday reasoning. On a daily basis we draw inferences such
as how a person will probably act, what the weather will probably be like, and
how a meal will probably taste, and these are typical inductive inferences. So
if researchers want to study a form of reasoning that is actually a pervasive
cognitive activity, then induction is of appropriate interest.
The second reason to study induction is that it is a multifaceted cognitive
activity. It can be studied by asking young children simple questions involving
cartoon pictures, or it can be studied by giving adults a variety of complex
verbal arguments and asking them to make probability judgments. Although
induction itself is uncertain by nature, there is still a rich, and interesting, set
of regularities associated with induction, and researchers are still discovering
new phenomena.
Third, induction is related to, and it could be argued is central to, a number
of other cognitive activities, including categorization, similarity judgment,
probability judgment, and decision making. For example, much of the study
of induction has been concerned with category-based induction, such as
inferring that your next door neighbor sleeps on the basis that your neighbor
is a human animal, even if you have never seen your neighbor sleeping.
And as will be seen, similarity and induction are very closely related, many
accounts of induction using similarity as their main currency (Heit & Hayes,
2005).
Finally, the study of induction has the potential to be theoretically revealing.
Because so much of people’s reasoning is actually inductive reasoning, and
because there is such a rich data set associated with induction, and because
induction is related to other central cognitive activities, it is possible to find
out a lot about not only reasoning but cognition more generally by studying
induction.
Induction is traditionally contrasted with deduction, which is concerned
with drawing logically valid conclusions that must follow from a set of
premises. The following section will consider possible definitions of induction
by describing possible relations between induction and deduction. But first it
is useful to briefly mention that the reasons for studying induction to some
extent are linked to the differences between induction and deduction. That is,
it could be argued that induction, in comparison to deduction, characterizes
more of everyday reasoning, has the potential to be studied with a broader
range of tasks and materials, and is closely related to other cognitive activities
that help people manage uncertainty.
problems of deduction, and how inductive processes may differ (or not differ)
from deductive processes. These two views will now be addressed in turn.
It should be clear that neither (1) nor (5) is deductively valid, yet some-
how (1) seems more plausible in terms of being a good inductive argument.
Whatever rules of logic are used to define deductive arguments may not be
too useful in determining that (1) is stronger than (5).
valid argument. On the other hand, an argument such as (2) above might
seem to have a very certain conclusion, perhaps 99.5% certain. This level of
certainty could still be well over the threshold that is required for saying that
an argument is deductively valid. Let’s say, hypothetically, that arguments
with conclusions below the 99% level of certainty will be called deductively
invalid. Even among these arguments, this version of the problem view allows
a great deal of differentiation. For example, argument (1) might be associated
with 80% certainty and argument (5) might be associated with 10% cer-
tainty. Hence (1) would be considered a much stronger inductive argument
in comparison to (5).
Perhaps the greatest appeal of this version of the problem view is that
it allows for deduction and induction to be placed on a common scale of
argument strength. In principle, any argument would have a place on this
scale, and whether it is deductively valid, inductively strong, or inductively
weak would be determined by the value on the scale. The most obvious
problem, though, is that there is still a need for assessing the place of each
argument on the scale. One nice idea might be an inductive logic, that is,
some set of rules or operations that, for a given set of premises, can assign a certainty value to a conclusion.
A subtler problem is that “certainty” itself would need to be defined better.
For example, in argument (1), either the conclusion that all mammals have
hearts is true or it is not, so the conversion from probability to certainty
may not be obvious. For example, it would seem a little funny to assign a
certainty level from 0% to 100% to a statement that is either true or false.
(Perhaps certainty could be related to the proportion of mammals with hearts,
rather than to whether it is true that all mammals have hearts.) Another issue
to clarify is the distinction between argument strength and certainty of the
conclusion. Argument (1) may seem strong simply because people believe
the conclusion that all mammals have hearts. Now compare that argument to
argument (7), below.
Here is a situation where the two arguments have the same conclusion,
which is equally certain in each case, but (1) seems much stronger than
(7). It could be valuable to consider other ways of representing argument
strength here, such as the conditional probability of the conclusion given
the premise, or the difference between the unconditional probability of
the conclusion and the conditional probability of the conclusion, given the
premise.
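To make these alternatives concrete, and without committing to any particular inductive logic, the two measures just mentioned can be written in probabilistic notation, letting $C$ stand for the conclusion and $P$ for the premise:

$$ \text{strength}_{1} = \Pr(C \mid P), \qquad \text{strength}_{2} = \Pr(C \mid P) - \Pr(C). $$

On the first measure, an argument is strong when its conclusion is highly probable given its premise; on the second, an argument is strong when the premise raises the probability of the conclusion above its unconditional value. The second measure offers one way of capturing why (1) seems stronger than (7) even though the two conclusions are equally certain.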
Figure 1.1. Criterion-shift account of deduction and induction. [The figure shows arguments arrayed along a single scale running from minimum to maximum argument strength; Criterion 1 separates inductively weak from inductively strong arguments, and Criterion 2, set higher on the scale, separates deductively invalid from deductively valid arguments.]
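To illustrate the logic depicted in Figure 1.1, the following sketch (a minimal illustration in Python; the induction criterion of 0.60 is invented, while the 0.99 deduction criterion and the argument strengths echo the hypothetical values discussed above for arguments (1), (2), and (5)) shows how a single underlying strength value, compared against two different criteria, could yield both deduction and induction judgments.

```python
# Minimal sketch of the criterion-shift account: one underlying strength value,
# two response criteria. All numerical values are illustrative only.

DEDUCTION_CRITERION = 0.99   # stricter criterion: call the argument "deductively valid"
INDUCTION_CRITERION = 0.60   # laxer criterion: call the argument "inductively strong"

def judge(strength, instructions):
    """Return a yes/no judgment from the same strength value under either instruction."""
    criterion = DEDUCTION_CRITERION if instructions == "deduction" else INDUCTION_CRITERION
    return strength >= criterion

arguments = {"(1)": 0.80, "(2)": 0.995, "(5)": 0.10}

for name, strength in arguments.items():
    print(name,
          "deductively valid:", judge(strength, "deduction"),
          "inductively strong:", judge(strength, "induction"))

# Because both judgments are monotonic in the same strength value, a single-
# process account of this kind predicts that deduction and induction
# instructions can shift the criterion but never reverse the ordering of
# arguments -- the prediction tested by Rips (2001) and Heit and Rotello (2005).
```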
first system, whereas deduction would depend more on the second system.
These two-process accounts have been used to explain a variety of findings
in reasoning, concerning individual differences, developmental patterns, and
relations between reasoning and processing time. For example, in Stanovich’s
work there is a rich account of how reasoning in the more deliberative system
is correlated with IQ, accounting for patterns of individual differences in a
variety of problems that would rely more on one system or the other.
but consistent arguments. Rips concluded that these results contradicted the
criterion-shift account, which predicted a monotonic ordering of arguments
in the two conditions. (See Heit & Rotello, 2005, for further examinations of
this kind, leading to the same conclusion.)
In sum, there is some evidence already that giving people deduction versus
induction instructions leads to qualitatively different results. It would seem
difficult to explain these results by assuming that deduction and induction
processes are essentially the same, except that deduction involves a stricter
criterion for giving a positive response. Yet at the same time, it seems too early
to abandon the one-process accounts, which do provide detailed and accurate
predictions about a range of phenomena, usually either concerning deductive
or inductive problems. In contrast, the two-process view does not seem to
be as well developed in terms of providing detailed process accounts and
predictions. More experimentation, directly aimed at comparing the one-
and two-process views and at further developing the two-process view, is
clearly needed.
At a more general level, the process view itself seems to be a rich and
worthwhile approach to studying induction. Certainly for psychologists, the
problem view does not seem viable. It is a mistake to assume that people
are performing deduction processes on designated deduction problems and
induction processes on designated induction problems. Likewise, even for
psychologists who are developing process level accounts of reasoning, it is
important to keep in mind the wide range of possible reasoning problems.
Assuming there is at least considerable overlap between deduction and in-
duction processes, an ideal theory of reasoning would not be limited to either
traditional problems of deduction or induction but would encompass both
types of problem.
to accept the atomic theory over its principal rival, known as energeticism,
which conceived of matter as being continuous rather than being composed
of particles.
More generally, it should be possible to study inductive reasoning by study-
ing historical examples of reasoning, whether by scientists or others. One
advantage of studying scientific reasoning is that the evidence and theories
are usually explicit, in comparison to just studying everyday examples of rea-
soning. (See the previous book on induction, by Holland, Holyoak, Nisbett,
and Thagard, 1986, as well as a more recent collection by Gorman, Tweney,
Gooding, and Kincannon, 2005, for further examples.) There is an interest-
ing parallel between the historical approach to the study of induction and
the historical approach to the study of creativity (e.g., Weisberg, 1986). In
each case, it seems that much can be learned about cognition by looking at
paragon cases of thinking and reasoning, even outside the bounds of tightly
controlled psychological studies.
recently, however, there have been several attempts to formalize the advan-
tage for following the diversity principle. As reviewed by Wayne (1995), there
have been two approaches. The first approach compares correlated sources
of evidence to independent sources of evidence, in statistical terms. For for-
mal treatments of this correlation approach, linking similarity to probability
theory, see Earman (1992) and Howson and Urbach (1993). The second ap-
proach is the eliminative approach. The idea behind the eliminative approach
is that diverse data sets will be particularly useful for eliminating plausible
but incorrect hypotheses, allowing stronger inferences to be drawn based on
the remaining, contending hypotheses. In contrast, non-diverse data sets will
likely be consistent with too many hypotheses to allow any strong inferences.
For a formal treatment of this approach, including a geometric proof, see
Horwich (1982), and see Heit (1998) and Tenenbaum and Griffiths (2001)
for some psychological applications.
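The intuition behind the eliminative approach can be conveyed with a small set-based sketch. The hypothesis space below is invented purely for illustration and is not drawn from Horwich's (1982) treatment; the point is simply that diverse premise categories are consistent with fewer candidate hypotheses about the property's extension, so a stronger conclusion can be drawn from what remains.

```python
# Toy illustration of the eliminative approach to diversity. Hypotheses are
# candidate extensions of a novel property; observing a premise category with
# the property eliminates every hypothesis that excludes that category.
# The hypothesis space is invented for illustration only.

hypotheses = {
    "all mammals":               {"hippo", "rhino", "hamster", "mouse", "dog"},
    "large, thick-skinned only": {"hippo", "rhino"},
    "rodents only":              {"hamster", "mouse"},
}

def surviving(premise_categories):
    """Return the hypotheses consistent with every premise category."""
    return [name for name, extension in hypotheses.items()
            if all(category in extension for category in premise_categories)]

# Non-diverse premises leave several hypotheses in play, so little is settled:
print(surviving({"hippo", "rhino"}))      # ['all mammals', 'large, thick-skinned only']

# Diverse premises eliminate the narrow hypotheses, licensing a broad inference:
print(surviving({"hippo", "hamster"}))    # ['all mammals']
```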
Moreover, there have been arguments that following the diversity prin-
ciple is not normative. For example, using Earman’s (1992) derivations of
the diversity principle, Wayne (1995) showed that there can be exceptions,
namely, that non-diverse observations can lead to strong inferences if this
evidence is nonetheless very surprising. Wayne pointed to the near-simultaneous discovery in 1974 of a previously unknown subatomic particle in two laboratories as a case of non-diverse evidence that still had strong implications for the revision of theories in physics. Lo, Sides, Rozelle,
and Osherson (2002) raised a related criticism of the normative status of the
diversity principle. They too argued that what is crucial is not diversity of ob-
servations but rather surprisingness of observations. Lo et al. also suggested
a set of exceptions, such as the following:
Squirrels can scratch through Bortex fabric in less than 10 seconds. (8)
Bears can scratch through Bortex fabric in less than 10 seconds.
---------------------------------------------------------------------
All forest mammals can scratch through Bortex fabric in less than 10 seconds.
Squirrels can scratch through Bortex fabric in less than 10 seconds. (9)
Mice can scratch through Bortex fabric in less than 10 seconds.
---------------------------------------------------------------------
All forest mammals can scratch through Bortex fabric in less than 10 seconds.
It seems intuitive that squirrels and bears are a more diverse pair than
squirrels and mice. Yet Lo et al. argued that (9) is stronger than (8), because
the evidence about squirrels and mice is more surprising than the evidence
about squirrels and bears. That is, the knowledge that small animals are less
capable of feats of strength than are large animals makes the evidence
about squirrels and mice more surprising than evidence about squirrels and
bears.
Heit, Hahn, and Feeney (2005) argued that these exceptions to the diver-
sity principle, suggested by Wayne (1995) and Lo et al. (2002), are indeed
exceptions, but they do not undermine the normative status of the diversity
principle itself. In the example of the discovery of a new subatomic particle
in 1974, physicists were influenced not only by diversity but also by many
other sources of knowledge in particle physics. In the example of scratching
through Bortex fabric, people would be influenced not only by diversity but
also by other knowledge about animals and their strength. In other words,
these exceptions as stated do not contain all the premises upon which the
arguments are based. Reasoning about these arguments is also influenced by
other hidden premises or background knowledge, so that diversity is not being
assessed in isolation. Therefore, these counterexamples do not invalidate the
diversity principle, because they are not pure tests of diversity. Rather, they
show that people will use other knowledge when possible. Indeed, philoso-
phers of science have not claimed that the diversity principle is the sole
principle for assessing evidence. For example, Popper (1963, p. 232) listed
diversity of supporting evidence as one of six criteria for assessing a scientific
theory.
In more general terms, it should be possible to consider a variety of patterns
in inductive reasoning in the light of the normative question, namely, what is
good reasoning. For example, one of the most pervasive findings in psycho-
logical research on induction is the similarity effect, namely, that arguments
such as (3) above concerning dogs and wolves are considered stronger than
arguments such as (10).
Dogs have hearts. (10)
-------------------------
Bees have hearts.
The subjects judged arguments like (12) to be stronger than arguments like
(11), in response to the greater diversity of hippos and hamsters compared to
hippos and rhinos. Indeed, there is a great deal of evidence that adults, mainly
Western university students, follow the diversity principle when evaluating
written arguments (see Heit, 2000, for a review).
However, when looking to other subject populations, and to evidence col-
lected at a greater distance from the psychology lab, there seem to be exceptions
to the diversity principle as a descriptive account. In their study of Itzaj-
Mayan adults from the rainforests of Guatemala, Lopez, Atran, Coley, Medin,
and Smith (1997) did not find evidence for diversity-based reasoning, using
arguments with various living things and questions about disease transmis-
sion. Indeed, sometimes the Itzaj reliably chose arguments with non-diverse
premise categories over arguments with diverse categories. It appears that
they were using other knowledge about disease transmission that conflicted
with diversity-based reasoning. For example, given a non-diverse argument
that two similar kinds of tall palm trees get some disease, one person claimed
it would be easy for shorter trees, located below, to get the disease as well.
Giving further support to this idea that other strategies and knowledge can
overrule diversity, Proffitt, Coley, and Medin (2000) reported that American
adults who are tree experts (such as landscapers) did not show strong diversity
effects when reasoning about trees and their diseases. The tree experts seemed
to be relying on the knowledge that tree diseases tend to spread readily within
tree families such as elms and maples.
Medin, Coley, Storms, and Hayes (2003) reported further exceptions to the
diversity principle. One exception, referred to as non-diversity by property
reinforcement, potentially makes a direct challenge to the diversity principle
that is not as easily explained in terms of the use of other knowledge. The idea
behind non-diversity by property reinforcement is that two diverse categories
When given a forced choice between polar bears and antelopes versus polar
bears and penguins, subjects judged the two animals from the same biological
class, polar bears and antelopes, to be more similar than the two animals from
different biological classes, polar bears and penguins. However, when asked
to assess the inductive strength of each argument, argument (14) was judged
to be less convincing than argument (13). That is, argument (13) had less
diverse evidence, yet it was the stronger argument. Intuitively, although polar
bears and penguins are from different biological classes, they still share the
characteristic of living in a cold climate. It might seem that property X does
not apply to all animals but only to animals living in cold climates.
Heit and Feeney (2005) investigated the non-diversity by property rein-
forcement effect further and came to a somewhat different conclusion. Es-
sentially, their subjects judged polar bears and penguins to be more similar
than polar bears and antelopes. Hence, when argument (13) was judged to be
stronger than argument (14), Heit and Feeney’s subjects were showing a diver-
sity effect rather than a non-diversity effect. Heit and Feeney concluded that
the diversity effect was indeed robust, and their results suggest that exceptions
may be hard to show consistently.
diverse animals, dogs and bees, have some biological property. The purpose
of this study was to see whether subjects reason that “if two such disparate
animals as dogs and bees” have this property then “all complex animals must”
(p. 141). Indeed, adults made broad inferences to all animals, extending the
property not only to things that were close to the premises (other mammals
and insects) but also to other members of the animal category (such as birds
and worms). In contrast, the children seemed to treat each premise separately;
they drew inferences to close matches such as other mammals and insects, but
they did not use the diversity information to draw a more general conclusion
about animals. Therefore in this first attempt there was evidence for effects
of diversity in adults but not children. In a follow-up study, Carey looked
at diversity effects based on the concept living thing rather than animal.
The key comparison was that children were taught a biological fact either
about dogs and bees or about dogs and flowers, with the latter being even
more diverse than the former. Given a fact about dogs and flowers, children
did tend to generalize fairly broadly, suggesting that children may have some
sensitivity to diversity of premise categories. However, if anything they tended
to overgeneralize, extending the property not only to other living things but
often to inanimate objects as well. Hence, there was suggestive evidence for
the impact of diversity of premise categories in this study, although children
did not show the same pattern as adults.
Continuing along this line of research that looks for diversity effects in
children, Lopez et al. (1992) found limited evidence for nine-year-olds and no
evidence for five-year-olds. For the five-year-olds, choices in a picture-based
task did not show any sensitivity to diversity of premise categories, even when
the diversity was explicitly mentioned by the experimenter. However, nine-
year-olds did show sensitivity to diversity of premises, but only for arguments
with a general conclusion category such as animal rather than a specific
conclusion category such as kangaroo. Gutheil and Gelman (1997) attempted
to find evidence of diversity-based reasoning for specific conclusions in nine-
year-olds, using category members at lower taxonomic levels which would
presumably enhance reasoning. However, like Lopez et al. (1992), Gutheil
and Gelman did not find diversity effects in nine-year-olds, although in a
control condition with adults, there was robust evidence for diversity effects.
More recently, however, Heit and Hahn (2001) reported diversity effects in
children younger than nine years in experiments that used pictures of people
and everyday objects as stimuli rather than animals with hidden properties.
For example, children were shown a diverse set of dolls (a china doll, a stuffed
doll, and a Cabbage Patch doll), all being played with by a girl named Jane.
Also children were shown a non-diverse set, three pictures of Barbie dolls,
being played with by Danielle. The critical test item was another kind of doll,
a baby doll, and the question was who would like to play with this doll. In
another stimulus condition, there was a diverse set of hats worn by one person,
and a non-diverse set worn by another person, and again, the critical question
was whether another hat would belong to the person with diverse hats or the
person with non-diverse hats. For 74% of these critical test items, children
age five to eight years made the diverse choice rather than the non-diverse
choice. It seems from the Heit and Hahn experiments that children can follow
the diversity principle at some level. However, it will take further work to
establish the critical differences that led the past studies to not find diversity
effects in children. (See Lo et al., 2002, for additional results, and Gelman,
2003, for further discussion.)
The second component of the model measures how well the premise cate-
gories cover the superordinate category that includes all the categories men-
tioned in an argument. For example, in arguments (11) and (12), the relevant
features and connections. Although this model does have a notion of breadth
of features, there is no distinct component for assessing coverage of a super-
ordinate category, and indeed the model does not even rely on knowledge
about superordinate categories. Nonetheless, the Sloman model can account
for not only diversity effects but a variety of other phenomena.
Heit (1998). The final model to be discussed is the Bayesian model by Heit
(1998). This model is linked to eliminative accounts of hypothesis testing and
as such is a normative model of how to reason with a hypothesis space. In
addition, this account is fairly successful as a descriptive account in the sense
that it predicts most of the same psychological phenomena as the Osherson
et al. (1990) and Sloman (1993) models. According to the Bayesian model,
evaluating an inductive argument is conceived of as learning about a property,
in particular learning for which categories the property is true or false. For
example, upon learning that dogs have some novel property X, the goal
would be to infer which other animals have property X and which do not. For
example, do wolves have the property and do parrots have the property? The
key assumption is that for a novel property such as in this example, people
would rely on prior knowledge about familiar properties to derive a set of
hypotheses about what the novel property may be like. People already know
a relatively large number of properties that are true of both dogs and wolves,
so if property X applies to dogs, then it probably applies to wolves too. On the
other hand, people know a relatively small number of properties that are true
of both dogs and parrots. Hence property X is relatively unlikely to extend to
parrots.
How does the Bayesian model explain diversity effects? In brief, diverse
categories bring to mind a very different set of hypotheses than non-diverse
categories. If hippos and hamsters have some novel property X in common,
one considers familiar properties that are shared by all mammals, such as
warm-bloodedness. Hence, property X too seems to extend broadly to all
mammals, assuming that it is distributed in a similar way as other properties
that are brought to mind. In contrast, if hippos and rhinos have property X in
common, it is easier to think of familiar properties that are shared by hippos
and rhinos but not most other mammals, such as being large and having a
tough skin. Property X, too, may be distributed in the same way, namely,
only to large, thick-skinned mammals, and seems less likely to extend to all
mammals.
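As one way of making this description concrete, the following sketch carries out a Bayesian calculation in the spirit of the model just described, extending the eliminative sketch given earlier with prior weights over hypotheses. The hypothesis space, the prior weights standing in for counts of familiar shared properties, and the category names are all invented for illustration; they are not the model's actual parameterization.

```python
# Illustrative Bayesian calculation in the spirit of Heit (1998). Hypotheses are
# possible extensions of the novel property X; prior weights stand in for how
# many familiar properties are known to have each extension. All numbers and
# category names are invented for illustration.

from fractions import Fraction

hypotheses = {
    # hypothesis name: (extension of property X, prior weight)
    "all mammals":               ({"hippo", "rhino", "hamster", "dog", "wolf"}, Fraction(6)),
    "large, thick-skinned only": ({"hippo", "rhino"},                           Fraction(3)),
    "canines only":              ({"dog", "wolf"},                              Fraction(3)),
}

def posterior(premises):
    """Condition on the premises: keep hypotheses whose extension contains every premise category."""
    kept = {name: weight for name, (extension, weight) in hypotheses.items()
            if all(category in extension for category in premises)}
    total = sum(kept.values())
    return {name: weight / total for name, weight in kept.items()}

def probability_of_conclusion(premises, conclusion_category):
    """Posterior probability that the conclusion category also has property X."""
    return sum(p for name, p in posterior(premises).items()
               if conclusion_category in hypotheses[name][0])

# Diverse premises (hippos and hamsters) leave only the broad hypothesis standing,
# so the property is inferred to extend to all mammals (probed here via wolves):
print(probability_of_conclusion({"hippo", "hamster"}, "wolf"))   # 1

# Non-diverse premises (hippos and rhinos) keep a narrower hypothesis in play,
# so the same conclusion receives a lower probability:
print(probability_of_conclusion({"hippo", "rhino"}, "wolf"))     # 2/3
```

In this toy example the diverse premise pair yields the broad generalization directly, whereas the non-diverse pair leaves open the possibility that the property is confined to large, thick-skinned mammals, mirroring the verbal account above.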
In sum, this section has illustrated different modeling approaches to in-
duction, which have subsequently developed in later work. Interestingly,
these three models address a similar set of phenomena with different
This chapter should at the very least illustrate the richness of research on
induction. Research on this topic might seem to face a lot of challenges. Af-
ter all, the degree of overlap with deduction has not yet been determined,
and some accounts of reasoning simply define induction in terms of not be-
ing deduction. By their very nature, inductive inferences do not have a “right
answer” in the same way as deductive inferences. Yet there are still regularities,
such as the diversity principle, which can be studied from a variety of per-
spectives, including historical, philosophical, experimental, developmental,
and computational. By no means is this regularity completely deterministic;
indeed, there are well-documented exceptions to the diversity principle that
are themselves of interest.
The material in this chapter should be seen as an invitation to consider
different approaches to induction and different phenomena in induction,
including those presented in the subsequent chapters of this book. All of
the chapters refer to the experimental approach to at least some extent. The
chapters by Hayes and by Medin and Waxman focus on the developmental
approach. The chapters by Tenenbaum and by Blok, Osherson, and Medin largely
take the modeling approach. The chapters by Rips and Asmuth, and by
Thagard, involve the historical approach, and the chapters by Oaksford and
Hahn, and by Thagard, involve the philosophical approach. Finally, several
of the chapters (by Rehder, Rips & Asmuth, Oaksford & Hahn, Thagard, and
Feeney) directly address the question of what induction is.
References
Bacon, F. (1620/1898). Novum organum. London: George Bell and Sons.
Carey, S. (1985). Conceptual change in childhood. Cambridge, MA: Bradford Books.
Carnap, R. (1950). Logical foundations of probability. Chicago: University of Chicago
Press.
Chater, N., & Oaksford, M. (2000). The rational analysis of mind and behavior. Synthese,
122, 93–131.
Earman, J. (1992). Bayes or bust? A critical examination of Bayesian confirmation theory.
Cambridge, MA: MIT Press.
Evans, J. St. B. T., & Over, D. E. (1996). Rationality and reasoning. Hove, UK: Psychology
Press.
Feeney, A., & Handley, S. J. (2000). The suppression of q-card selections: Evidence
for deductive inference in Wason’s selection task. Quarterly Journal of Experimental
Psychology, 53A, 1224–1242.
Gelman, S. A. (2003). The essential child: Origins of essentialism in everyday thought. New
York: Oxford University Press.
Goel, V., Gold, B., Kapur, S., & Houle, S. (1997). The seats of reason: A localization
study of deductive and inductive reasoning using PET (O15) blood flow technique.
NeuroReport, 8, 1305–1310.
Goodman, N. (1972). Problems and projects. Indianapolis: Bobbs-Merrill.
Gorman, M. E., Tweney, R. D., Gooding, D. C., & Kincannon, A. P. (Eds.), (2005).
Scientific and technological thinking. Mahwah, NJ: Lawrence Erlbaum Associates.
Gutheil, G., & Gelman, S. A. (1997). Children’s use of sample size and diversity in-
formation within basic-level categories. Journal of Experimental Child Psychology, 64,
159–174.
Harman, G. (1999). Reasoning, meaning, and mind. Oxford: Oxford University Press.
Heit, E. (1998). A Bayesian analysis of some forms of inductive reasoning. In M. Oaksford
& N. Chater (Eds.), Rational models of cognition, 248–274. Oxford: Oxford University
Press.
Heit, E. (2000). Properties of inductive reasoning. Psychonomic Bulletin & Review, 7,
569–592.
Heit, E., & Feeney, A. (2005). Relations between premise similarity and inductive
strength. Psychonomic Bulletin & Review, 12, 340–344.
Heit, E., & Hahn, U. (2001). Diversity-based reasoning in children. Cognitive Psychology,
43, 243–273.
Heit, E., Hahn, U., & Feeney, A. (2005). Defending diversity. In W. Ahn, R. Goldstone,
B. Love, A. Markman, & P. Wolff (Eds.), Categorization inside and outside of the
laboratory: Essays in honor of Douglas L. Medin, 87–99. Washington, DC: American
Psychological Association.
Heit, E., & Hayes, B. K. (2005). Relations among categorization, induction, recognition,
and similarity. Journal of Experimental Psychology: General, 134, 596–605.
Heit, E., & Rotello, C. M. (2005). Are there two kinds of reasoning? In Proceedings of
the Twenty-Seventh Annual Conference of the Cognitive Science Society. Hillsdale, NJ:
Lawrence Erlbaum Associates.
Heit, E., & Rubinstein, J. (1994). Similarity and property effects in inductive reason-
ing. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 411–
422.
Hempel, C. G. (1966). Philosophy of natural science. Englewood Cliffs, NJ: Prentice Hall.
Holland, J. H., Holyoak, K. J., Nisbett, R. E., & Thagard, P. (1986). Induction: Processes
of inference, learning, and discovery. Cambridge, MA: MIT Press.
Horwich, P. (1982). Probability and evidence. Cambridge, UK: Cambridge University
Press.
Howson, C., & Urbach, P. (1993). Scientific reasoning: The Bayesian approach. Chicago:
Open Court.
Hume, D. (1777). An enquiry concerning human understanding. Oxford: Clarendon Press.
Johnson-Laird, P. (1983). Mental models. Cambridge, MA: Harvard University Press.
From one perspective virtually every cognitive act carried out by young chil-
dren involves induction. Compared to older children and adults, young chil-
dren have little background knowledge about the world and have only a shaky
grasp of the rules that govern propositional reasoning. They are continually
faced with the task of making inferences based on their past observations
and experience. This is true whether the child is trying to determine which
kinds of animals have red blood cells, whether a specific tool can be used to
assemble a bike, or whether the new kid who has moved next door is likely
to be friendly. This chapter focuses on a particularly important aspect of the
development of inductive reasoning – the way that children use their under-
standing of categories and category structure to generalize attributes from
familiar to novel instances.
Category-based induction typically involves three components. First, ob-
serving that an inductive base or “premise” item X has the property P (e.g., that a shark has fins); second, deciding that X and a target or “conclusion” item Y
are related in some way (e.g., that a shark and a trout are both fish); and third,
inferring whether Y shares property P.1 Investigating children’s inductive rea-
soning allows us to determine when this reasoning ability first emerges and
to chart important age-related changes in induction. It is also a valuable tool
for understanding the development of children’s category representations. In
fact, some important theories of conceptual development (e.g., Carey, 1985)
have been based, for the most part, on the results of studies of age changes in
children’s inductive generalizations.
This chapter will review and discuss work on the development of category-
based induction from infancy through early and later childhood. The first part
1 Throughout this chapter the terms “base” and “premise,” and “target” and “conclusion” are
considered synonymous and will be used interchangeably.
of the chapter will consider just what kinds of information infants and children
use to decide whether the properties of one category generalize to other
categories. To preface the main conclusions it will be argued that, by five years
of age, children seem aware of a number of different ways in which two or more
categories can be related, and they can use these different kinds of relations
as a basis for inductive inference. In addition to relatively obvious relations
such as the perceptual similarity of the inductive base and target, young
children use taxonomic, hierarchical, and causal principles to guide property
inferences. They are also sensitive to the fact that inferences often depend on
the kind of property being considered. More complex heuristics that require
the integration of information across multiple instances or premises (e.g., the
premise diversity principle described in Chapter 1) may emerge somewhat
later in development. However, even here recent evidence suggests that school-
age children are better at using these principles than was once thought.
The second part of the chapter will examine the implications of this de-
velopmental research for general theories of induction, highlighting the ways
that the developmental data can be used to evaluate and constrain competing
theories. The third and final part will consider the best way of explaining de-
velopmental change in induction. Again, to preface the main argument, the
current evidence suggests that many of the fundamental processes involved
in inductive reasoning are developmentally stable from the preschool years
and that the most significant age-related changes in induction may be due to
the growth of more sophisticated knowledge about objects, object properties,
and category structures.
Perceptual Similarity
Classic theories of cognitive development emphasized the role played by
perceptual similarity in determining early categorization and induction
(e.g., Bruner, Olver, & Greenfield, 1966; Inhelder & Piaget, 1958; Vygotsky,
1934/1986). Subsequent work has confirmed that the influence of perceptual
similarity on property inferences emerges at an early point in development.
Baldwin, Markman, and Melartin (1993), for example, allowed infants aged
between nine and sixteen months to inspect an object (the inductive base) that
had an unusual and unexpected property (e.g., it wailed when it was shaken
or tilted). Infants were then shown other toys that varied in their perceptual
similarity to the base. In the critical test condition these were modified so
that they could not generate the unexpected sound. Nevertheless, even the
Taxonomic Relations
One of the most consistent themes in this book is that people are sensitive
to the fact that the members of closely related categories are likely to share
novel, non-obvious properties. It is important therefore to establish just when
and how children first make use of shared taxonomic relations as a basis for
inductive inference. The most common approach to studying this issue has
been to examine whether infants and children are more likely to generalize
a novel property from an inductive base to target when these are given the
same noun label.
Studies with infants. Welder and Graham (2001) examined the effects of
verbal labels and perceptual similarity on induction by sixteen- to twenty-one-
month-old infants. Infants were shown a base object with a novel property
(e.g., a cloth-covered toy that squeaked when squeezed). This was presented
together with either a novel noun label (e.g., “Look at this blint”) or without a
label (e.g., “Look at this one”). Infants were then shown test objects that varied
in their similarity to the shape of the base, with the label present or absent.
When there was no label infants relied on shape similarity in generalizing the
property. In the shared label conditions generalization of the novel property
depended on both shared labels and shape. This same pattern of inference
has been shown with thirteen-month-olds who are just beginning to acquire
productive language (Graham et al., 2004).
Many have interpreted such findings as evidence that infants understand
that noun labels supply information about category membership and that
Studies with preschoolers and older children. One of the first sug-
gestions that children look beyond surface similarity when doing induction
came from the work of Carey (1985). Children aged from four to ten years,
and adults, were shown a picture of a base item such as a person or other ani-
mal and were told that it had an unobservable novel property (e.g., “a spleen
inside it”). Carey then presented a range of other objects, including dogs,
bees, and flowers, as well as inanimate kinds like rocks, and asked people to
decide which ones shared the novel property. Although children’s inferences
were often influenced by the similarity between the base and target items,
Carey also found some important exceptions. For example, young children
were more likely to generalize a biological property of humans to “bugs” than
from bees to bugs. Carey explained this finding by arguing that humans were
regarded by young children as a kind of ideal or prototypical animal, so that
their properties were highly likely to be shared by other animals (see Chap-
ter 3 by Medin and Waxman for further discussion).
In a series of studies spanning two decades, Susan Gelman and her col-
leagues have tried to systematically separate out the effects of taxonomic and
perceptual factors on young children’s induction (see Gelman, 2003, for a
review). Gelman and Markman (1986), for example, used a triad task in
which four-year-olds were taught a novel property about a base picture (e.g.,
a tropical fish) with a familiar category label (“fish”) and shown two target
pictures. One of these was perceptually similar to the base but had a different
label (e.g., dolphin). The other looked less like the base but had the same
label (e.g., a picture of a shark labelled as a “fish”). Children were asked to
choose which of the targets was more likely to have the novel property. The
majority of four-year-olds chose the taxonomic rather than the perceptual
match. Subsequent work has found the same pattern of inference in children
as young as two years of age (Gelman & Coley, 1990). Preschoolers’ preference
for inferences based on category labels over perceptual appearances has been
found with realistic photographs as well as line drawings of animals (Deák &
Bauer, 1996; Florian, 1994), and extends to artifact (Gelman, 1988) and social
categories (e.g., Diesendruck, 2003; Heyman & Gelman, 2000a, b).
As we saw earlier there are at least two ways to interpret such findings. First,
labels can have an indirect effect on inductive inferences by serving as a cue
that the base and target instances belong to the same category. Alternatively,
common labels could have a direct effect by serving as identical auditory
features making the premise and conclusion items more perceptually similar.
A series of control studies support the argument that young children use
shared category membership rather than just the perceptual aspects of shared
labels as the basis for their inferences. Preschool children reliably gener-
alize novel properties from premise to conclusion items with perceptually
dissimilar but synonymous labels (e.g., “bunny” and “rabbit”) (Gelman &
Markman, 1986). Critically, however, preschool children do not use shared
labels as a basis for induction when these labels lack information about cat-
egory membership. Gelman and Coley (1990), for example, found that the
presence of a shared label referring to a transient state like “is sleepy” did not
promote the generalization of biological properties. Similarly, young children
do not use shared proper names like “Anna” as a basis for generalizing be-
havioural traits, presumably because they recognize that such names refer to
individuals rather than members of a social category (Heyman & Gelman,
2000a). Young children do not generalize properties on the basis of shared
labels when these conflict with existing category knowledge (e.g., when catlike
animals are labelled as “dogs”) (Jaswal, 2004), or when they cross superordi-
nate category boundaries (e.g., when the same label is given to animals and
artifacts) (Davidson & Gelman, 1990).
In some cases labels may not be necessary for preschool children to gen-
eralize properties along taxonomic lines. Deák and Bauer (1996) presented
young children with realistic photographs of premise objects (e.g., a panther)
and conclusion alternatives from the same basic-level categories (e.g., a Tabby
house cat) or a different category (e.g., a black horse). The different-category
photo was always more similar in appearance to the premise than the same-
category item. Even when no labels were given, children were more likely
to generalize novel properties based on shared category membership than
on overall perceptual appearance. Children used the information contained
in the photographs to infer that the panther and cat belonged to the same
biological category, and they used this relationship rather than appearance as
a basis for property induction.
A final line of evidence suggesting that young children use taxonomic re-
lations in induction is their understanding that typical category members
2 It should be noted that López et al. (1992) offer an alternative explanation of typicality effects,
arguing that they can arise through an assessment of the similarity of the features of premise
and conclusion items, without the activation of categorical information.
3 A little caution is required in drawing implications about the development of property inference
from studies using a name extension paradigm. Adults and children usually distinguish names
from other object properties and may generalize each in different ways (Markman & Ross, 2003).
Nevertheless, the cited studies establish that infants understand the categorical implications of
shared labels.
Hierarchical Relations
Taxonomic knowledge also entails an understanding that objects are embed-
ded in hierarchies based on class inclusion relations. Daffodils are understood
to be a kind of flower which is, in turn, a living thing. Like older children
and adults, preschool children are most likely to generalize a novel property
from a category to other items at the same hierarchical level, and they show
decreasing generalization to test items located at more abstract levels (Carey,
1985; Gelman & O’Reilly, 1988; Inagaki & Hatano, 1996).
An important finding in other conceptual tasks like categorization and
naming is that certain hierarchical levels are psychologically basic or “privi-
leged” (Rosch, Mervis, Gray, Johnson, & Boyes-Braem, 1976). For example,
children usually acquire the names of objects at intermediate levels of a hi-
erarchy (e.g., apple) before they learn the names for things located at a more
subordinate (e.g., Delicious apple) or superordinate level (e.g., fruit). In the
case of induction the privileged level can be defined as the highest level in a
hierarchy where there is a strong belief that the properties of objects at that
level can be generalized to an immediate superordinate.
Coley, Hayes, Lawson, and Moloney (2004) examined the location of the
privileged level within a biological hierarchy in five-year-olds, eight-year-olds,
and adults. All groups were given two kinds of conceptual tasks; feature listing
and inductive inference. In the first task participants listed the features they
thought were shared by the members of each of four hierarchical levels (e.g.,
animals, birds, sparrows, house sparrows). In the induction task participants
were taught a novel property about an instance from a given hierarchical level
and asked whether this generalized to categories at the immediate superor-
dinate level (e.g., from house sparrows to sparrows, from sparrows to birds,
from birds to animals). The key finding was that, for eight-year-olds and
adults, the location of the privileged level often differed across the two tasks.
In feature listing the highest level in the hierarchy where objects were believed
to share many features was the life-form (e.g., bird, fish, tree). For adults and,
in many cases, for eight-year-olds, the privileged level for induction was the
more specific folk-generic level (e.g., sparrow, maple). Adults and eight-year-
olds generalized novel properties from very specific biological kinds such as
house sparrow and sugar maple to sparrows and maples, respectively, but
they were less likely to generalize from sparrows to birds or maples to trees.
Five-year-olds showed a different pattern with the life-form level privileged
for both feature listing and induction in the majority of biological categories.
The task discrepancies in the location of the basic level show that by eight
years of age children often rely on different sorts of information for feature
listing and induction. Feature listing reflects knowledge about category mem-
bers. Adults and children knew more about the kinds of features shared by the
members of life-form categories than they did about more specific categories.
But their inferences reflected an expectation that features would cluster at the
folk-generic level. One source of such expectations is the nomenclature for
members of subordinate categories. In English (and most other languages)
levels below the folk-generic are marked as subtypes by the use of compound
names (e.g., desert oak). Coley et al. (2004) show that by eight years of age
children recognize the implications of such subordination for inductive in-
ference. An understanding of linguistic hierarchies alone, however, is unlikely
to explain the emergence of a privileged level for induction. In a related study
with adults, Coley, Medin, and Atran (1997) found that the privileged level
for induction remained at the folk-generic level even when category labels
contained a reference to the life-form level (e.g., goldfish, catfish). Expecta-
tions about feature clusters at the folk-generic level therefore may also reflect a
more abstract belief that category members at this level share some underlying
biological property that gives rise to those features.
More research is needed to clarify the factors that lead certain hierarchical
levels to be privileged in children’s induction. Nevertheless, the findings of
Coley et al. (2004) highlight the importance of children’s beliefs about category
hierarchies in induction and indicate that these beliefs involve more than
just an appreciation of the common and distinctive features of instances at
different hierarchical levels.
Some researchers have argued, however, that younger children's apparent insensitivity to premise diversity and monotonicity reflects difficulties with the performance aspects of the tasks used by López et al.
(1992) and Gutheil and Gelman (1997). Heit and Hahn (2001) suggest that
young children may have difficulty in applying principles like diversity and
monotonicity to hidden or invisible properties (e.g., “has leucocytes inside”).
Moreover, López et al. (1992) required children to respond to a variety of
inductive arguments (e.g., typicality, similarity, monotonicity, and diversity)
administered in the same testing session. To perform well children would have
to not only understand each kind of argument, but also switch between induc-
tive strategies. Heit and Hahn (2001) tried to overcome these performance
barriers by examining five- and nine-year-olds’ generalization of concrete,
observable properties from more or less diverse premise sets. For example,
children were shown photographs of a diverse set of three dolls and told that
they belonged to a girl named Jane. They were also shown a set of three very
similar-looking dolls and told that they belonged to a girl named Danielle.
They were then shown a new doll and asked who was the most likely owner.
For observable properties, even five-year-olds reliably based their inferences
on the diverse rather than the non-diverse premises. When hidden properties
such as “has nuts inside” were used, only the nine-year-olds were sensitive to
premise diversity.
In a similar vein Hayes and Kahn (2005) re-examined sensitivity to premise
monotonicity in five- and nine-year-olds using items that always involved a
choice between inferences based on a small or a large premise set. For example,
children were presented with a small set of two grey turtles which shared a
property such as “eats snails” or a larger set of six brown turtles that shared a
different property such as “eats worms.” The premises were presented with the
cover story that they were samples from a larger inclusive category (i.e., both
kinds of turtles came from the same zoo). Children were then asked to decide
which property was most likely to be found in either another animal at the
same hierarchical level (e.g., a turtle whose colour was unknown) or in most
of the other animals of the same type at the zoo. Nine-year-olds were more
likely to generalize the property of the larger set to both specific and general
types of conclusions. Younger children did so only for the general conclusions
(see Lo et al., 2002, for additional positive findings regarding children’s use of
monotonicity).
These data suggest that, by five years of age, children are able to use inductive
heuristics like premise monotonicity and diversity. Exactly how they do this
remains to be explained. One interpretation is that an ability to compute
coverage develops by around five years of age. On the other hand it could
be that Osherson et al.’s assumptions about how people process multiple
premises in diversity and monotonicity tasks need to be revised.
Property Knowledge
In all of the cases considered so far the focus has been on how children use
various kinds of relations between premise and conclusion categories to draw
inferences about properties. Unlike most of the items used in experimental
induction tasks, however, most cases of everyday induction involve some
knowledge of the properties to be inferred, as well as the relevant categories.
Whether a property is generalized from one bird to another depends on
whether the property is an internal organ, nesting pattern, or type of birdsong.
There is plenty of evidence that adult inductive inference varies as a function of
both knowledge about the categories and the predicate in question (e.g., Heit
& Rubinstein, 1994; Nisbett, Krantz, Jepson, & Kunda, 1983). The evidence
reviewed below suggests that preschool children, and possibly younger infants,
are also sensitive to category-property interactions.
Preschoolers, for example, generalize stable, trait-like properties such as "likes to play tibbits" but not transient properties such as "feeling thirsty" (Heyman & Gelman, 2000a).
Children also appear sensitive to certain kinds of specific property-category
relations. In an ingenious study, Kalish and Gelman (1992) presented three-
and four-year-olds with novel objects that had two labels, one that referred to
object kind and one that described the material that the object was made from.
In the critical experimental condition these labels supported different kinds
of inferences. For example, whether “glass scissors” are judged to be fragile
depends on which of the two labels is seen as more relevant to the judgment.
Kalish and Gelman found that young children were sensitive to the kinds of
labels that were relevant to generalizing different sorts of properties. When
asked whether objects would share a novel dispositional property (e.g., “will
fracture in water”), young children based their judgments on the material
label. When asked whether they shared a novel functional property (e.g.,
“used for partitioning”), children used the object label.
Kelemen, Widdowson, Posner, Brown, and Casler (2003) have also shown
that the direction of children’s inferences within biological categories depends
on the kind of property that is being generalized. Three- and four-year-
olds based inferences about which animals had similar behavioural features
(e.g., “fights off dangerous animals”) on shared body parts relevant to this
behaviour (e.g., having horns) rather than on overall perceptual similarity.
This pattern reversed when children were generalizing a novel category label
(cf. McCarrell & Callanan, 1995).
With age there appears to be an increasing appreciation of the role of
category-property interactions in induction. Gelman (1988) presented four-
and eight-year-olds with natural kinds such as flowers and artifacts such
as chairs. Children were asked to decide whether the properties of these
objects generalized to other natural kinds and artifacts. Some children were
asked about properties which (from an adult perspective) were thought to
be most generalizable within the domain of natural kinds (e.g., “needs CO2
to grow”), while others were asked about properties thought to be more
generalizable for artifacts (e.g., “you can loll with it”). Control conditions
also examined properties thought to be generalizable to neither domain or
to be equally generalizable to both. The older children were sensitive to the
relations between categories and properties, being more likely to generalize
“natural kind properties” to natural kind exemplars than to artifacts, and vice
versa. Four-year-olds, however, did not discriminate between the properties
typically associated with the two domains.
One way of interpreting the developmental trends described by Gelman
(1988) is that sensitivity to category-property relations in induction increases
as children learn more about the causal principles that operate in different
conceptual domains and the specific properties associated with exemplars
from each domain. Although a direct test of this developmental hypothesis
has yet to be carried out, indirect support comes from studies of the effects
of expertise on adult reasoning. Shafto and Coley (2003) found that fish
experts used causal and ecological relations in generalizing familiar properties
like “has disease X.” When generalizing properties about which they knew
little (e.g., “has sarca”), both experts and novices relied on similarity-based
strategies. Hence, increasing domain knowledge is associated with greater use
of inductive reasoning based on the interactions between specific categories
and properties.
Causal Relations
The previous section showed that children’s background knowledge of both
categories and properties plays an important role in their inductive inferences.
Exactly how does such knowledge interact with more “domain-general” prin-
ciples such as premise-conclusion similarity, monotonicity, diversity, and so
on? Recent studies with adults suggest that when people have some knowl-
edge about the relations between premises this can override more general
inductive principles. Medin, Coley, Storms, and Hayes (2003) showed that
diversity-based reasoning is undermined when people perceive a salient or
distinctive relation between the premises that is unlikely to be shared by a
conclusion category. Adults, for example, judged that a novel property shared
by fleas and butterflies was more likely to generalize to sparrows than a prop-
erty shared by dogs and fleas, even though they saw the second set of premises
to be more diverse. In other words, they were aware of the ecological relations
between dogs and fleas and judged that this special relation was unlikely to
generalize to other animals (but see Heit and Feeney, 2005, for a different
interpretation).
An important goal for studies of the development of inductive reasoning
is to specify how children integrate their causal knowledge with an under-
standing of more general inductive principles, and to examine whether this
process changes as children develop more knowledge of the world. As a first
step in this direction Thompson and Hayes (2005) devised a task in which
five-year-olds, nine-year-olds, and adults were forced to choose between in-
ductive inferences based on shared causal relations or featural similarity (cf.
Lassaline, 1996). The basic procedure is illustrated in Figure 2.1. Attribute base
items were fictional animals made up of three features that were not linked via
explicit causal relations but which could plausibly co-exist. Causal base items
contained two features that could be linked by causal relations, together with a statement of the direction of causation. All features were illustrated with line drawings. After presentation of the bases, a target instance was described. As Figure 2.1 shows, this always had a higher featural similarity to the attribute base but contained the antecedent feature from the causal base. Participants had to predict which of two other properties was most likely to be found in the target: the third feature of the attribute base or the consequent feature from the causal base.

[figure 2.1. Schematic stimulus example from Thompson and Hayes (2005), showing the base items, the target animal ("The following facts are true of Animal C"), and the induction test.]
The main results are summarized in Figure 2.2. The proportion of “causal”
choices at test exceeded chance for each age group. There was also a clear
increase in the proportion of causal choices with age. When children’s justifi-
cations for their choices were considered, there was even more robust evidence
for the early development of an awareness of the role of causal relations in
induction. Five-year-olds mentioned the causal relation in over forty percent
of justifications. By comparison, explicit appeals to feature similarity as a basis
for induction were relatively rare for all age groups.

[figure 2.2. Mean proportion of causal choices on the Thompson and Hayes (2005) causal induction task, plotted by age group (5-year-olds, 9-year-olds, adults).]

This shows that causal
relations between the features of premise and conclusion items can override
featural similarity as a basis for induction, even in young children. It remains
to be shown whether these results generalize to other category domains like
artifacts and social kinds. Further work is also needed to establish just which
aspects of causal relationships lead children to make stronger inductions.
table 2.1. Summary of major category-based induction phenomena in infants and children

Shared labels override perceptual similarity | Infants: Yes (Graham et al., 2004; Welder & Graham, 2001) | Children: Yes (Deák & Bauer, 1996; Florian, 1994; Gelman, 1988; Gelman & Coley, 1990; Gelman & Markman, 1986, 1987; Heyman & Gelman, 2000a, b; Sloutsky & Fisher, 2004)
Similarity overrides shared labels:
  When labels conflict with existing knowledge | Infants: — | Children: Yes (Davidson & Gelman, 1990; Jaswal, 2004)
  When adjective labels are used | Infants: — | Children: Yes (Gelman & Coley, 1990; Gelman & Heyman, 1999)
  When labels are proper names | Infants: — | Children: Yes (Heyman & Gelman, 2000a)
Conclusion homogeneity | Infants: — | Children: Yes (Carey, 1985; López et al., 1992)
Premise typicality | Infants: — | Children: Yes (Lo et al., 2002; López et al., 1992)
Premise diversity: specific conclusions | Infants: — | Children: Yes (Heit & Hahn, 2001)
Transient vs. entrenched properties | Infants: — | Children: Yes (Gelman, 1988; Gelman & Coley, 1990)
Property generalization depends on the category domain | Infants: (Yes) (Mandler & McDonough, 1996, 1998) | Children: Yes (Carey, 1985; Gelman, 1988; Heyman & Gelman, 2000b; Kalish & Gelman, 1992)
Shared functional features promote inference | Infants: Yes (McCarrell & Callanan, 1995; Rakison & Hahn, 2004) | Children: Yes (Kelemen et al., 2003)
Shared causal relations promote inference | Infants: — | Children: Yes (Thompson & Hayes, 2005)

Note: Yes = Robust evidence for the phenomenon; (Yes) = Some positive evidence but some controversy still exists; No = Robust evidence against the phenomenon; — = No studies with the relevant age group.
Bayesian Models
Heit (1998) proposed a Bayesian model of induction such that generalization
of a novel property from a premise to a conclusion category depends on
prior knowledge about known properties shared by these categories. This
knowledge is used to generate hypotheses about the projection of the novel
property (e.g., whether it is shared by both the premise and conclusion categories, or only by the premise categories). Evaluating an argument then amounts to updating beliefs about these hypotheses in light of the premises.
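To illustrate the general logic of such an account, the following short sketch in Python computes argument strength as the posterior probability that the conclusion category has the novel property, given that the premise categories do. The hypotheses, categories, and prior values below are illustrative assumptions introduced here for exposition, not parameters taken from Heit (1998).

```python
# Minimal sketch of Bayesian property induction in the spirit of Heit (1998).
# Hypotheses are possible extensions of the novel property over categories;
# priors are assumed to mirror how often familiar properties show each pattern.
# All numbers are illustrative assumptions.

categories = ["cow", "horse", "hedgehog"]

# Each hypothesis: (set of categories that have the property, prior probability).
hypotheses = [
    ({"cow", "horse", "hedgehog"}, 0.40),  # property true of all of these mammals
    ({"cow", "horse"},             0.35),  # true of the large farm animals only
    ({"cow"},                      0.15),  # true of cows only
    ({"hedgehog"},                 0.10),  # true of hedgehogs only
]

def posterior_conclusion(premise_categories, conclusion_category):
    """P(conclusion has the property | the premise categories have it)."""
    # Keep only hypotheses consistent with the premises and renormalize priors.
    consistent = [(ext, p) for ext, p in hypotheses if premise_categories <= ext]
    total = sum(p for _, p in consistent)
    return sum(p for ext, p in consistent if conclusion_category in ext) / total

# "Cows have property X; therefore horses have property X."
print(posterior_conclusion({"cow"}, "horse"))      # relatively strong (about .83)
# "Cows have property X; therefore hedgehogs have property X."
print(posterior_conclusion({"cow"}, "hedgehog"))   # weaker (about .44)
```

In this toy example the argument to a similar conclusion category is stronger simply because more of the plausible hypotheses that include the premise category also include that conclusion.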
Relevance Theory
With a view to explaining inductive phenomena in which people have varying
levels of knowledge about inductive predicates, Medin et al. (2003) proposed
a relevance framework such that people assume that the premises provided in inductive arguments are informative and will search for distinc-
tive relations between them. For example, if told that koalas, kangaroos, and
wallabies all shared some property, you might infer that the property had
something to do with being an Australian animal (or more specifically, an
Australian marsupial). Moreover, the activation of this distinctive relation
can take place before considering the similarity between the premises and
conclusions. If the conclusion category shares the distinctive relation (e.g., if it is another Australian animal), then the novel property is likely to be
generalized. According to this approach, causal relations between premises
are highly distinctive and therefore likely to influence induction.
A detailed process model of relevance has yet to be described, so it is difficult to compare this approach with more explicit models such as the similarity-coverage model (SCM) and the feature-based induction (FBI) model. In general terms, however, this approach can explain why distinctive features, such as causal relations between premises, exert a strong influence on induction.

With development, then, inductive reasoning appears to involve certain kinds of information (e.g., causal relations) being used more frequently and across a wider range of domains.
This final part of the chapter addresses two issues relating to this develop-
mental pattern. First, how do children acquire an understanding of the
relations that can be used as a basis for induction? Second, where there are
age-related changes in induction, can these simply be explained by the gradual
expansion of children’s background knowledge about categories, exemplars,
and exemplar properties?
4 This of course raises the question of the origins of essentialist biases or beliefs. See Gelman
(2003) for a review of possible answers.
Sloutsky and Fisher (2004), for example, report that young children rely on overall similarity rather than shared category membership when generalizing a novel property to other category members. They also show that, with appropriate
training, four-year-olds can readily learn to generalize in the same way as
older children. Sloutsky and Fisher argue that their results show that young
children do not have an a priori assumption that category members share
hidden properties. However, the rapidity with which children learned to
generalize along categorical lines could also be seen as consistent with a bias
which favours category-based induction.
So far “bottom-up” models of category acquisition and induction have
been applied to only a small part of the developmental induction data in
Table 2.1. They can account for children’s increasing reliance on labels as a
basis for induction over time. However, it is unclear how they would explain
why different kinds of labels have different effects on inductive inference, and
why structural and functional features sometimes override labels as a basis for
induction. Such models also offer no principled account of property-category
interactions. It is too early to tell, therefore, whether these approaches will
succeed without the incorporation of additional constraints like an essentialist
bias. Nevertheless, the tension between these approaches has had a positive
effect on the field by stimulating the development of more explicit models of
the mechanisms involved in learning which object features are most important
in promoting induction.
I would like to thank Susan Thompson and Tamara Cavenett for their valuable
comments and assistance in the preparation of this chapter. The research
described in the chapter was supported by Australian Research Council Grant
DP0344436.
References
Baldwin, D. A., Markman, E. M., & Melartin, R. L. (1993). Infants’ ability to draw
inferences about nonobvious object properties: Evidence from exploratory play. Child
Development, 64, 711–728.
Booth, A. E., Waxman, S. R., & Huang, Y. T. (2005). Conceptual information permeates
word learning in infancy. Developmental Psychology, 41, 491–505.
Bruner, J. S., Olver, R. R., & Greenfield, P. M. (1966). Studies in cognitive growth. New
York: Wiley.
Carey, S. (1985). Conceptual change in childhood. Cambridge, MA: MIT Press, Bradford
Books.
Coley, J. D., Hayes, B., Lawson, C., & Moloney, M. (2004). Knowledge, expectations, and
inductive reasoning within conceptual hierarchies. Cognition, 90, 217–253.
Coley, J. D., Medin, D. L., & Atran, S. (1997). Does rank have its privilege? Inductive
inferences within folkbiological taxonomies. Cognition, 64, 72–112.
Davidson, N. S., & Gelman, S. A. (1990). Inductions from novel categories: The role of
language and conceptual structure. Cognitive Development, 5, 151–176.
Deák, G. O., & Bauer, P. (1996). The dynamics of preschoolers’ categorization choices.
Child Development, 67, 740–767.
Diesendruck, G. (2003). Categories for names or names for categories? The interplay
between domain-specific conceptual structure and language. Language and Cognitive
Processes, 18, 759–787.
Farrar, M. J., Raney, G. E., & Boyer, M. E. (1992). Knowledge, concepts, and inferences
in childhood. Child Development, 63, 673–691.
Florian, J. E. (1994). Stripes do not a zebra make, or do they? Conceptual and perceptual
information in inductive inference. Developmental Psychology, 30, 88–101.
Gelman, S. A. (1988). The development of induction within natural kind and artifact
categories. Cognitive Psychology, 20, 65–95.
Gelman, S. A. (2003). The essential child: Origins of essentialism in everyday thought.
Oxford: Oxford University Press.
Gelman, S. A., & Coley, J. D. (1990). The importance of knowing a dodo is a bird:
Categories and inferences in 2-year-old children. Developmental Psychology, 26, 796–
804.
Gelman, S. A., & Markman, E. M. (1986). Categories and induction in young children.
Cognition, 23, 183–209.
Gelman, S. A., & Markman, E. M. (1987). Young children's inductions from natural
kinds: The role of categories and appearances. Child Development, 8, 157–167.
Gelman, S. A., & O’Reilly, A. W. (1988). Children’s inductive inferences within super-
ordinate categories: The role of language and category structure. Child Development,
59, 876–887.
Gentner, D. (1988). Metaphor as structure mapping: The relational shift. Child Devel-
opment, 59, 47–59.
Goldstone, R. L. (1994). The role of similarity in categorization: Providing a groundwork.
Cognition, 52, 125–157.
Goswami, U. (2002). Inductive and deductive reasoning. In U. Goswami (Ed.), Blackwell
handbook of cognitive development (pp. 282–302). Oxford, UK: Blackwell.
Graham, S. A., Kilbreath, C. S., & Welder, A. N. (2004). 13-month-olds rely on shared
labels and shape similarity for inductive inferences. Child Development, 75, 409–427.
Gutheil, G., & Gelman, S. A. (1997). Children’s use of sample size and diversity in-
formation within basic-level categories. Journal of Experimental Child Psychology, 64,
159–174.
Halford, G., & Andrews, G. (2004). The development of deductive reasoning: How
important is complexity? Thinking and Reasoning, 23, 123–145.
Hayes, B. K., & Kahn, T. (2005). Children’s sensitivity to sample size in inductive rea-
soning. Paper presented at the 14th Biennial Conference of the Australasian Human
Development Association, Perth, July.
Heit, E. (1998). A Bayesian analysis of some forms of inductive reasoning. In M. Oaksford
& N. Chater (Eds.), Rational models of cognition (pp. 248–274). Oxford: Oxford
University Press.
Heit, E., & Feeney, A. (2005). Relations between premise similarity and inductive
strength. Psychonomic Bulletin & Review, 12, 340–344.
Heit, E., & Hahn, U. (2001). Diversity-based reasoning in children. Cognitive Psychology,
47, 243–273.
Heit, E., & Hayes, B. K. (2005). Relations between categorization, induction, recognition
and similarity. Journal of Experimental Psychology: General, 134, 596–605.
Heit, E., & Rubinstein, J. (1994). Similarity and property effects in inductive reasoning.
Journal of Experimental Psychology: Learning, Memory, & Cognition, 20, 411–422.
Henderson, A. M., & Graham, S. A. (2005). Two-year-olds’ appreciation of the shared
nature of novel object labels. Journal of Cognition and Development, 6, 381–402.
Heyman, G. D., & Gelman, S. A. (2000a). Preschool children’s use of trait labels to make
inductive inferences. Journal of Experimental Child Psychology, 77, 1–19.
Heyman, G. D., & Gelman, S. A. (2000b). Preschool children’s use of novel predicates to
make inductive inferences about people. Cognitive Development, 15, 263–280.
Inagaki, K., & Hatano, G. (1996). Young children’s recognition of the commonalities
between animals and plants. Child Development, 67, 2823–2840.
Inhelder, B., & Piaget, J. (1958). The growth of logical thinking from childhood to adoles-
cence. New York: Basic Books.
Jaswal, V. (2004). Don’t believe everything you hear: Preschoolers’ sensitivity to speaker
intent in category induction. Child Development, 75, 1871–1885.
Jones, S. S., & Smith, L. B. (1993). The place of perception in children's concepts. Cognitive
Development, 8, 113–139.
Kalish, C. W., & Gelman, S. A. (1992). On wooden pillows: Multiple classifications and
children’s category-based inductions. Child Development, 63, 1536–1557.
Keil, F. (1991). The emergence of theoretical beliefs as constraints on concepts. In S. Carey
& R. Gelman (Eds.), The epigenesis of mind (pp. 133–169). Hillsdale, NJ: Erlbaum.
Kelemen, D., Widdowson, D., Posner, T., Brown, A. L., & Casler, K. (2003). Teleo-
functional constraints on preschool children’s reasoning about living things. Devel-
opmental Science, 6, 329–345.
Lassaline, M. E. (1996). Structural alignment in induction and similarity. Journal of
Experimental Psychology: General, 22, 754–770.
Lo, Y., Sides, A., Rozelle, J., & Osherson, D. (2002). Evidential diversity and premise
probability in young children’s inductive judgment. Cognitive Science, 26, 181–206.
Loose, J. J., & Mareschal, D. (1999). Inductive reasoning revisited: Children’s reliance on
category labels and appearances. In Proceedings of the Twenty-First Annual Conference
of the Cognitive Science Society (pp. 320–325). Mahwah, NJ: Erlbaum.
López, A., Gelman, S. A., Gutheil, G., & Smith, E. E. (1992). The development of
category-based induction. Child Development, 63, 1070–1090.
Madole, K. L., & Oakes, L. (1999). Making sense of infant categorization: Stable processes
and changing representations. Developmental Review, 19, 263–296.
Mandler, J. M., & McDonough, L. (1996). Drinking and driving don’t mix: Inductive
generalization in infancy. Cognition, 59, 307–335.
Mandler, J. M., & McDonough, L. (1998). Studies in inductive inference in infancy.
Cognitive Psychology, 37, 60–96.
Markman, A. B., & Ross, B. (2003). Category use and category learning. Psychological
Bulletin, 129, 592–613.
Markman, E. M., & Callanan, M. A. (1984). An analysis of hierarchical classification.
In R. Sternberg (Ed.), Advances in the psychology of human intelligence, Vol. 2 (pp.
325–366). Hillsdale, NJ: Erlbaum.
McCarrell, N. S., & Callanan, M. A. (1995). Form-function correspondences in children’s
inference. Child Development, 66, 532–546.
McClelland, J. L., & Rogers, T. T. (2003). The parallel distributed processing approach
to semantic cognition. Nature Reviews Neuroscience, 4(4), 310–322.
McDonald, J., Samuels, M., & Rispoli, J. (1996). A hypothesis assessment model of
categorical argument strength. Cognition, 59, 199–217.
Medin, D. L., Coley, J., Storms, G., & Hayes, B. K. (2003). A relevance theory of induction.
Psychonomic Bulletin & Review, 10, 517–532.
Murphy, G. L. (2004). On the conceptual-perceptual divide in early concepts. Develop-
mental Science, 7, 513–515.
Nisbett, R. E., Krantz, D. H., Jepson, C., & Kunda, Z. (1983). The use of sta-
tistical heuristics in everyday inductive reasoning. Psychological Review, 90, 339–
363.
Osherson, D. N., Smith, E. E., Wilkie, O., López, A., & Shafir, E. (1990). Category-based
induction. Psychological Review, 97, 185–200.
Quinn, P. C., & Eimas, P. D. (2000). The emergence of category representations dur-
ing infancy: Are separate perceptual and conceptual processes required? Journal of
Cognition and Development, 1, 55–61.
Rakison, D. H., & Hahn, E. (2004). The mechanisms of early categorization and in-
duction: Smart or dumb infants? In R. Kail (Ed.), Advances in child development and
behavior, Vol. 32 (pp. 281–322). New York: Academic Press.
Rosch, E., Mervis, C. B., Gray, W. D., Johnson, D. M., & Boyes-Braem, P. (1976). Basic
objects in natural categories. Cognitive Psychology, 8, 382–439.
Shafto, P., & Coley, J. D. (2003). Development of categorization and reasoning in the
natural world: Novices to experts, naive similarity to ecological knowledge. Journal of
Experimental Psychology: Learning, Memory, & Cognition, 29, 641–649.
Sloman, S. A. (1993). Feature-based induction. Cognitive Psychology, 25, 231–280.
Sloutsky, V. M., & Fisher, A. V. (2004). Induction and categorization in young children:
A similarity-based model. Journal of Experimental Psychology: General, 133, 166–
188.
Springer, K. (1992). Children’s awareness of the biological implications of kinship. Child
Development, 63, 950–959.
Springer, K. (2001). Perceptual boundedness and perceptual support in conceptual
development. Psychological Review, 108, 691–708.
Thompson, S., & Hayes, B. K. (2005). Causal induction in adults and children. Paper
presented at 32nd Australasian Experimental Psychology Conference, Melbourne
University, April.
Vygotsky, L. S. (1986). Thought and language. Cambridge, MA: MIT Press (original work
published 1934).
Waxman, S. R., & Booth, A. E. (2001). Seeing pink elephants: Fourteen-month-olds’
interpretations of novel nouns and adjectives. Cognitive Psychology, 43, 217–242.
Welder, A. N., & Graham, S. A. (2001). The influence of shape similarity and shared
labels on infants’ inductive inferences about nonobvious object properties. Child
Development, 72, 1653–1673.
Like adults, children use categories as a basis for inductive inference. Having
learned that some property is true of some individual (e.g., “My dog, Magic,
likes marshmallows”), a child might assume both that other members of the
same category (dogs) share this property and that members of other, simi-
lar categories (e.g., cats) might also share this property. In short, inductive
inference may be guided by and reflect categorical relationships, hence the
term “category-based induction” or CBI. Cognitive and developmental re-
searchers have used a category-based induction paradigm not only to study
the use of categories in reasoning, but also to draw inferences from patterns of
reasoning about the nature of conceptual structures themselves (Atran et al.,
2001; Carey, 1985; Gelman & Markman, 1986; Gutheil, Bloom, Valderrama,
& Freedman, 2004; Gutheil, Vera, & Keil, 1998; Hatano & Inagaki, 1994; Ina-
gaki, 2002; Johnson & Carey, 1998; Medin & Smith, 1981; Ross, Medin, Coley,
& Atran, 2003; Waxman, Lynch, Casey, & Baer, 1997).
One of the most influential examples of the use of inductive projections
to draw inferences about conceptual organization comes from research in
the domain of folkbiology involving categories of living things, including
humans, nonhuman animals, and plants. Developmental evidence has re-
vealed, in particular, certain systematic asymmetries in inductive strength
among these categories. For example, researchers have found that children
are more willing to project properties from humans to dogs than from dogs
to humans. As we will see, one interpretation of this result is that children’s
biological knowledge is organized around humans as the prototype and that it
is more natural to generalize or project from the prototype to its variants than
from variants to the prototype (more about this later). There is even some
evidence that young, urban children may violate the principle of similarity
by generalizing more from humans to bugs than from bees to bugs (Carey,
1985).
In this chapter, our focus will be on asymmetries in children’s inductive pro-
jections. We begin by describing Carey’s studies and outlining the framework
within which she interpreted her findings. Subsequent work has questioned
the generality of Carey’s findings across populations and procedures, but rel-
atively little attention has been directed toward the theoretical basis for the
asymmetries that she observed. We point out that although Carey’s theoret-
ical position is consistent with the observed asymmetries, it is not unique
in this regard. We describe several other alternative interpretations and their
associated claims about conceptual organization and processing principles.
The upshot is that human-animal asymmetries underdetermine theoretical
accounts, and that further empirical constraints are needed.
We next continue to examine asymmetries but broaden their range to
include asymmetries involving humans, nonhuman mammals, plants, and
insects. We also expand the empirical base by drawing on category-based
induction in children from a range of cultures. The data show certain asym-
metries that appear to hold across populations. These additional sources of
evidence offer considerable leverage for evaluating alternative interpretations
of asymmetries. To foreshadow, from this broader view, we find that the
asymmetries in inductive inference are not well characterized as knowledge
or typicality effects favoring humans, as Carey has suggested. Instead, these
asymmetries seem to reflect the manner by which distinctive categories and
features become activated by the comparison processes associated with induc-
tive inferences. On this account, a major source of human-animal asymme-
tries is the ambiguous status of humans as members of the category “animal”
and as members of a category “human” that is contrastive with animal. The
distinctive features/categories account leads to further predictions that are
supported by children’s open-ended justifications of their responses. We close
with a discussion of implications of this newer interpretation of asymmetries
for the understanding of category structure, conceptual organization, and
conceptual development.
In her original studies, Carey (1985) taught children that a base kind (e.g., humans or dogs) has a novel internal property and asked whether the property also held for a range of living and nonliving kinds as targets. The now-classic finding was that children
from four to six years of age willingly project novel properties from humans
to nonhuman animals but are reluctant to make the converse generalization
from nonhuman animals to humans. More specifically, young children who
were told that people have a little green thing inside them, called an omentum,
were quite willing to assert that dogs also have an omentum. In contrast, when
children were told that dogs have an omentum, they were reluctant to infer
that people also have an omentum. In general, the youngest children were very
reluctant to generalize from any base other than humans to other kinds. One
striking consequence of this tendency is that the youngest children were more
likely to generalize from humans to bugs than from bees to bugs, despite the
stronger perceptual similarity and taxonomic relation between the latter than
the former pair. Carey took this asymmetry as a reflection of the underlying
conceptual structure of the young child’s mind, in which biological knowledge
is organized around humans as the prototype. Let's take a closer look.
An important assumption underlying Carey’s interpretation of her results
is that children’s development can be understood in terms of domain-specific
competences. At a minimum, one can distinguish between three domains
in which children form naïve theories: physics, which is concerned with the
physical principles that govern the actions of objects in the world (e.g., Spelke,
1990); psychology, which deals with beliefs, desires, and intentions (Leslie,
1984; Wellman & Gelman, 1992); and biology, which revolves around knowl-
edge of plants and animals (Carey, 1985). One issue is whether these domains
are present at birth and become elaborated with development, or whether
they are wholly acquired. Carey and her colleagues argued that the pattern
of findings on biological induction supports the view that young children do
not have a distinct naïve biology, but rather their reasoning reflects a naïve
psychology where humans are the prototypical psychological entity. Only
later, when children are roughly seven to ten years of age, do they acquire a
distinct biology, in which humans are seen as one animal among many. Older
children do generalize from bases like “dog” and “bee,” and at this devel-
opmental point the asymmetries in reasoning are no longer present. Notice
that on this view, there is radical conceptual change, as children’s biological
reasoning shifts from an intuitive theory of psychology to a naïve biology.
There are, however, lingering questions concerning Carey’s interpretation.
One set of questions concerns the conditions under which these asymmetries
in inductive reasoning arise. We touch on this only briefly in the current
chapter (see Ross et al., 2003; Gutheil et al., 1998; and Atran et al., 2001,
for a broader analysis of the generality of human-animal asymmetries). More central to this chapter are the questions concerning the underlying basis of the asymmetries themselves.
A. Typicality Effects
The explanations included in this section are consistent with the view es-
poused by Carey. In essence, the argument is that the asymmetries in induc-
tive inference reflect an underlying conceptual structure in which humans are
considered the prototypical entity. There are two variants of this general view,
as described below, one relying on a central-tendency notion of typicality and
the other on typicality as based on ideals.
On the surface it does not appear that humans look more like other animals
than, say, a dog does. But if we allow frequency of exposure to examples of
the concept to bias typicality (see Barsalou, 1985, for evidence that frequency
of instantiation as a member of the category is the more relevant variable),
then humans should be the prototype, at least for urban children who may
have little experience with other animals.
How might typicality affect induction? Let's examine how the similarity-coverage model (SCM) of Osherson et al. (1990) accounts for typicality effects. True
to its name, the model has two components. The first is similarity. The greater
the similarity between the base and the target, the greater the confidence that
some novel property true of the base will be true of the target. The SCM
assumes that similarity is symmetrical (the similarity of A to B is the same
as the similarity of B to A), and therefore asymmetries do not arise from
the similarity component. Instead, it is the coverage component that leads to
typicality effects.
The coverage component works as follows: In addition to considering the
similarity of the base and target categories, the model assumes that partici-
pants generate examples of the lowest-level category that encompasses both
the base and target and then compute their similarity to the base. Because
participants tend to generate typical examples (even if sampling is random,
because a body of other work suggests that natural object categories have
a typicality structure; see Rosch & Mervis, 1975; Smith, Shoben, & Rips,
1974, for evidence and Smith & Medin, 1981, for a review), on average they
will be more similar to a typical than to an atypical base. Because inductive
confidence is assumed to increase with similarity, typical bases will support
stronger inductive inferences since they have greater similarity than atypi-
cal bases to the examples generated by the coverage component process. For
example, consider an inference from penguins to cardinals compared to an
inference from cardinals to penguins. Although the SCM treats similarity as
symmetrical, the latter inference is predicted to be stronger than the former.
The lowest-level superordinate in each case is “bird,” and participants should
think of specific (and typical) birds like robin, blue jay, and sparrow. Because
these are more similar to cardinals than to penguins, cardinal will be stronger
than penguin as an inductive base. In short, the SCM model predicts that
inference will be stronger for typical bases than for atypical bases and that the
coverage component of the model can generate asymmetries.
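To make the coverage computation concrete, here is a minimal sketch, in Python, of an SCM-style strength calculation for single-premise arguments. The similarity values and the "typical birds" assumed to be generated for the superordinate are invented for illustration; they are not data or parameters from Osherson et al. (1990).

```python
# Minimal SCM-style sketch for single-premise arguments (illustrative values).
# strength = alpha * similarity(base, target)
#          + (1 - alpha) * coverage(base, members of the lowest common superordinate)

# Hypothetical pairwise similarities among bird categories (0-1 scale).
SIM = {
    ("cardinal", "penguin"): 0.30,
    ("cardinal", "robin"): 0.90,
    ("cardinal", "blue jay"): 0.85,
    ("cardinal", "sparrow"): 0.88,
    ("penguin", "robin"): 0.25,
    ("penguin", "blue jay"): 0.22,
    ("penguin", "sparrow"): 0.20,
}

def sim(a, b):
    """Symmetric similarity lookup (1.0 for identical categories)."""
    if a == b:
        return 1.0
    return SIM.get((a, b), SIM.get((b, a), 0.0))

def coverage(base, superordinate_members):
    """Average similarity of the base to generated superordinate members."""
    return sum(sim(base, m) for m in superordinate_members) / len(superordinate_members)

def scm_strength(base, target, superordinate_members, alpha=0.5):
    return alpha * sim(base, target) + (1 - alpha) * coverage(base, superordinate_members)

# Participants are assumed to retrieve typical birds when "bird" is the
# lowest-level category spanning the base and the target.
typical_birds = ["robin", "blue jay", "sparrow"]

print(scm_strength("cardinal", "penguin", typical_birds))  # about 0.59: stronger
print(scm_strength("penguin", "cardinal", typical_birds))  # about 0.26: weaker
```

Because the similarity term is symmetric, the asymmetry between the two arguments here comes entirely from coverage: the typical base (cardinal) resembles the generated birds far more than the atypical base (penguin) does.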
Although this coverage component accounts for asymmetries at a theoret-
ical level, it is hard to see how it would account for the human asymmetries
observed in young children. More specifically, it is difficult to see why “hu-
mans" would have better coverage than other mammals (e.g., "dog," "wolf")
since humans are not especially similar to other mammals. We might need to
amend the SCM model to allow frequency to strongly affect the category ex-
amples generated and then argue that young children tend to think of humans
when generating examples of the superordinate category “animal.” (We note,
however, that recent evidence from our ongoing developmental studies sug-
gests that this is not the case.) A different strategy is to conceptualize humans
as a prototype in the ideal, rather than similarity, sense and to use something
other than the similarity-coverage model to account for the asymmetries. We
now turn to this option.
On this view, if humans are not treated as the ideal or most typical animal (as they may not be in some populations), then asymmetries should disappear.
All we need is some independent index of prototypicality to tie things down.
Once we have determined a typicality ordering, we should be able to predict
a range of asymmetries. In summary, from this perspective, there is nothing
special about humans with respect to generating asymmetries; the results
follow from the (ideal-based) typicality of humans. If we can establish a
ranking of idealness, then we should be able to predict a corresponding series
of asymmetries.
1. Knowledge Increases Projection. The first notion is that the more a par-
ticipant knows about some base, the more likely they are to project from it
to some target. This idea is endorsed by Kayoko Inagaki and Giyoo Hatano
(2001) and fits their observation that children who have experience raising
goldfish use goldfish as well as humans as a good base for inductive gener-
alization. The paradigm used by Inagaki and Hatano is somewhat different
from that employed by Carey, so one has to be careful about their interrela-
tionships. Inagaki and Hatano have shown that children who have knowledge
of certain biological properties will extend those properties to near neighbors.
We, however, do not know if we would find asymmetries if a novel property
were attributed to goldfish versus, say, a turtle. That is, we don’t know if an
inference from goldfish to turtle would be stronger than an inference from
turtle to goldfish. If such an asymmetry were found, it would provide strong
support for knowledge of a base driving induction.
There is also some more direct evidence consistent with the idea that knowl-
edge or familiarity with a base increases inductive projection (Atran et al.,
2001), and in the limiting case of complete ignorance about the kind in ques-
tion, it seems plausible. Nonetheless, an element is missing. It is not obvious
that increased familiarity or certainty should lead to a greater tendency to
generalize. After all, there is classic evidence from animal learning and con-
ditioning to suggest that increased experience (in the form of discrimination
training) tends to sharpen generalization gradients.
A related idea concerns prior beliefs. Although one might assume that novel properties attributed to different bases have uniform prior odds, this may not be the case. Specifically, assuming
that people are experts on people, they may have lower prior odds for novel
(blank) properties attributed to people than for the same properties attributed
to another base. That is, participants should be more surprised to hear that
humans have an omentum inside (where omentum is a novel property) than
to hear that a raccoon does.
Two factors (at least) appear to support this intuition. The first involves
a “lack of knowledge inference” (Collins & Michalski, 1989). This would
run something like the following: “If humans had omenta inside, then I
would have heard about it, so it’s unlikely that they do.” The second and
related idea is that there may be a bias for assuming that novel properties are
associated with unfamiliar categories (Testa, 1975). If so, the asymmetries will
follow directly for two reasons. First, because (by hypothesis) the priors are
lower for humans than for other mammals, even if the premise (regarding
the target) were ignored entirely and therefore had no effect whatsoever on
beliefs about the target category, blank properties should be more broadly
attributed when the target is a nonhuman mammal than when it is a human.
(In the limiting case they might ignore the premise and simply base judgments
on prior odds for the conclusion.) Second, it is generally assumed that the
more informative or surprising a premise is, the greater its effect on a target
(Osherson, Blok, & Medin, 2005). We suspect that it is more surprising to
hear that humans have an omentum inside than to hear that some other
mammal does. Taken together, then, these two factors suggest that inferences
from humans to (other) mammals should be higher than inferences from
mammals to humans.
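A toy calculation, with invented numbers, illustrates the first of these factors: if the prior probability of a blank property is lower for humans than for another mammal, then even a premise that shifts belief by the same amount in both cases leaves the human conclusion weaker.

```python
# Toy illustration (invented numbers) of the prior-odds point:
# posterior odds = prior odds * likelihood ratio contributed by the premise.

def posterior_prob(prior, likelihood_ratio):
    prior_odds = prior / (1 - prior)
    post_odds = prior_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

# Assumed priors for "X has an omentum inside" before any premise is given.
prior_human   = 0.05   # people are experts on people, so a novel property is surprising
prior_raccoon = 0.20   # more plausible for a less familiar kind

# Assume the premise shifts belief by the same factor for either target.
likelihood_ratio = 4.0

print(posterior_prob(prior_raccoon, likelihood_ratio))  # nonhuman target: about .50
print(posterior_prob(prior_human, likelihood_ratio))    # human target: about .17
```

Even with the premise treated identically in both directions, the lower prior for the human conclusion keeps attribution to humans weaker, which is the pattern the asymmetries show.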
On this account, humans have a dual category status: they are members of the inclusive category "animal," but they are also members of the more specific category "human" (contrasting with nonhuman animals).1 In contrast, nonhuman animals do not have this dual
status. It is important to point out that the inductive strength of a given target
is diminished by salient categories it has that do not apply to the base. These
observations are relevant to inductive inference in the following way: When
a human is the base and a nonhuman animal is a target, the more inclusive
category “animal” gets activated, which tends to prime the broad sense of
“animal” that includes humans. In contrast, when a nonhuman animal is
the base and a human is the target, the distinctive category of “human” gets
activated in the target, and, as we noted above, this diminishes inductive
confidence to nonhuman animals. (Later on we’ll see that the same sort of
asymmetry may arise for plants at a more superordinate level. We suspect that
because animals may be prototypical living things, plants are more distinctive
members of the superordinate category “living thing” than are animals. As a
result, we suspect that when a plant is the base and an animal is a target, the
more inclusive category “living thing” gets activated, but when an animal is
the base and a plant is the target, the distinctive category of “plant” is more
strongly activated, and this diminishes inductive confidence to animals.)
What might count as evidence for this proposal? First, if this proposal is
correct, it should be evident in participants’ justifications for their inductions.
For example, participants should mention the distinctive feature of humans
(e.g., “people aren’t animals”) as a justification more frequently when a hu-
man is a target than when it is a base for induction. A second piece of evidence
involves cross-cultural comparisons. We know that there are cultures and lan-
guages in which the contrast between humans and animals is more clear-cut
than in urban and suburban English-speaking communities in the United
States. For example, in Indonesia these categories are kept quite distinct, and
the Indonesian names are mutually exclusive (Anggoro, Waxman, & Medin,
2005). That is, the Indonesian term for “animal” cannot be applied to humans.
In such cultural contexts, we would expect to find less inductive generaliza-
tion between humans and nonhuman animals, and, as a consequence, the
asymmetries should be diminished. See Anggoro et al. (2005) for evidence
that this is the case.
For children growing up in urban and suburban communities in the United States, humans certainly can be considered animals, but they
may also be considered to be a special case, namely, as animals with more
distinctive features than other animals (e.g., humans are special because they
talk, have beliefs, construct buildings, cook their food, etc.). The core idea,
then, is that inductive inferences from nonhuman animals to humans should
be diminished whenever the distinctive features of humans are activated, and
such features are activated more when humans are the target than when they
are the base. This follows from the idea that induction begins with a focus on
the target category. By the same logic, this should hold for any animal, human or
nonhuman, whenever participants attribute more distinctive features to the
target than to the base. In other words, there is nothing special about humans
(save their distinctive features). If this is the case, then for any distinctive
category (e.g., “goldfish” for Inagaki and Hatano’s goldfish caretakers), gener-
alizing a novel property (e.g., from goldfish to frogs) should be stronger than
generalizations in the opposite direction (e.g., from frogs to goldfish). This
would follow if the young goldfish caretakers knew more distinctive features
about goldfish than about frogs.2
The account in terms of distinctive features of the target concept is not new.
For example, Sloman’s (1993) feature-based induction model assumes that
distinctive features of the target category reduce inductive confidence more
than the distinctive features of premise or base categories. Sloman (1993)
also reports a variety of data from undergraduate populations, including
asymmetries in inductive reasoning, supporting this assumption.
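In one common formulation of Sloman's rule, argument strength is the dot product of the premise and conclusion feature vectors divided by the squared magnitude of the conclusion vector, so features that are distinctive to the conclusion (target) category lower the ratio. The short sketch below, with invented feature vectors, shows how this yields a human-dog asymmetry of the kind discussed here; it is a schematic illustration rather than Sloman's (1993) full model.

```python
# Minimal sketch of a feature-based induction (FBI) strength rule.
# strength(conclusion | premise) = (P . C) / |C|^2
# Feature vectors are invented; 1 = feature present, 0 = absent.

def fbi_strength(premise_features, conclusion_features):
    dot = sum(p * c for p, c in zip(premise_features, conclusion_features))
    conclusion_magnitude = sum(c * c for c in conclusion_features)
    return dot / conclusion_magnitude

# Features:          [breathes, moves, fur, talks, builds, cooks]
dog   = [1, 1, 1, 0, 0, 0]
human = [1, 1, 1, 1, 1, 1]   # extra features distinctive to humans

print(fbi_strength(human, dog))    # human -> dog: 3/3 = 1.00
print(fbi_strength(dog, human))    # dog -> human: 3/6 = 0.50
```

Because the human vector carries extra distinctive features, an argument with humans as the conclusion is penalized, while the same shared features fully cover the dog conclusion.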
The feature-activation account shares similarities with the typicality ac-
count in that typicality is reduced by distinctive features. But note that the pre-
dictions are opposite in form – the feature activation story has it that distinc-
tive features of the target category reduce induction, and this implies a reverse
typicality effect. That is, when the premise or base is atypical and the target is
typical, inductive confidence should be greater than in the reverse ordering.
2 Note that, as in the assumption for category activation above, we are attributing to this view the
claim that the distinctive features of the target affect induction more than do distinctive features
of the base. This is the opposite of Tversky’s (1977) claim that it is the distinctive features of
the base that receive more weight in asymmetries (in similarity judgment). This point is worth
noting, but it is also worth noting that the induction paradigm is considerably different from a
similarity-judgment paradigm.
4. Summary. We will stop with these three main classes of explanation for asymmetries in induction, though there certainly are others.3
1. Causal associations produce the asymmetry. Many causal relations have directional implica-
tions for induction. If Grass has Enzyme X, we are more sure that Cows have it than we are
sure that Grass has Enzyme Y given that Cows have it (Medin et al., 2003). For this argument
to work, the causal associations should be more likely in the human to (other) animal case
than in the (other) animal to human case. One major problem with this explanation is that
among populations of children where we see this sort of reasoning (see Ross et al., 2003, for
details), the associations we see involving humans tend to have humans as an effect rather
than a cause (e.g., bees stinging humans). If this position has value, it may be in explaining
why human-animal asymmetries are not found in some cases.
2. Asymmetries are mediated by size differences between base and target. There are two moti-
vations for this idea. One is that both Itza’ Maya and tree experts sometimes explicitly refer
to relative size on induction tasks and argue that going from large to small (tree species)
is more likely than going from small to large (Medin et al., 1997). The other impetus is that children themselves may attend to size. Hatano and Inagaki (2003) observed
that children sometimes deny that ants have hearts on grounds that they are too small to
contain a heart. Note, however, that this example cuts the other way in that a property true
of ants will fit into human dimensions, but the converse is not true. Still, it may be that, in
induction, there is a big-to-small bias.
4 Humans are the exception to this generalization. Although humans have consistently been
presented as both bases and targets, the specific other mammal, insect, or plant that has been a
base is rarely also a target.
Finally, and perhaps most interestingly, there are differences in the types of
participants included in the original Carey task as compared to more recent
work. Several recent studies have moved beyond the urban and suburban
populations to examine how children raised in rural environments, who
enjoy a more intimate contact with nature, reason about the biological world.
We have found that children raised in rural environments sometimes appeal
to “ecological reasoning,” focusing on potential mechanisms of transmission
of a property from one biological kind to another. For example, when they
are told that “bees have sacra inside them,” these children often infer that
bears must also have sacra, and they justify their responses by mentioning
that sacra could be transmitted when a bee stings a bear or when a bear eats
honey (Ross et al., 2003).5 When ecological reasoning is combined with procedures in which base and target items are not counterbalanced, however, interpretive problems loom larger. For example, "bee" may be a
better base for supporting ecological reasoning than “fly.” The bottom line is
that skeptics might want to defer judgments about the claims we make in the
next section.
5 We should also note that in the causal-reasoning literature, there are some well-documented
asymmetries in which it is easier to reason from cause to effect than vice versa (see Kahne-
man & Tversky, 1974, for asymmetries in causal judgment). In principle, at least some of the
explanations that we have been considering might apply here as well.
6 We also have considerable data from a study which follows up and expands on these studies
with additional populations including Yukatek Maya children, majority culture and Menominee
children from rural Wisconsin, and Chicago area urban and suburban children (see also Atran
et al., 2001). In most cases we also have adult data but not always in sufficient quantity to be
reported (yet). Although we do not present these data, they consistently support and expand
the claims we make here from published data.
7 The asymmetry was clear for “peccary” as a base but essentially absent with “dogs” as a base. This
provides some support for knowledge or familiarity affecting induction. Nonetheless, familiarity
will not account for the other asymmetries even within the Atran et al. data.
Finding 7. Insect to Plant Asymmetries. In the Ross et al. (2003) data, where
the trends are a bit variable, there doesn’t seem to be any clear overall asymme-
try. For the newer data with mixed native and exotic bases there is a consistent
pattern of reverse asymmetries (plant to insect being greater than insect to
plant), though in several cases the effects are small because there is little
overall generalization between insects and plants (especially compared to the
generalization seen in Ross et al., 2003).
If typicality favoring humans were the driving force, inferences from humans to mammals should have been stronger than inferences from another mammal to mammals, and this was not the case. Instead, the ordering with
respect to asymmetries may conform better to a reverse typicality gradient,
suggesting that mammals are the most prototypical biological entity and
humans the least. If typicality alone cannot account for the underlying basis
for asymmetries, perhaps an account in which typicality is considered in
conjunction with similarity would fare better. The proposal would be that
humans are the ideal animal but, at the same time, are not especially similar
to other mammals. But this account seems to add a new parameter for every
data point it attempts to explain. Moreover, it provides no account of the plant
to mammal asymmetry (and the simultaneous absence of a human to plant
asymmetry). In short, the evidence from the broadened range of observations
undermines the view that typicality effects can account for the human-animal
asymmetries.
This pattern conforms closely with the data. So far this looks very good for
the distinctive features/categories position. The major limitation in what has
been presented so far is that we have relied on intuitive notions of distinctive
features and categories without providing any direct evidence bearing on
them. Justification data, to be presented next, do provide some converging
evidence for the distinctive features/categories account.
8 Importantly, this is true both for what one might call "alignable features" (Markman & Gentner,
1997) that reflect different values on a common dimension (e.g., number of legs) and for
nonalignable features (e.g., presence versus absence of a tail). This not only points to the
robustness of the focus on the target object but also undermines the idea that children might
focus on the base for comparisons of alignable targets and the target for nonalignable targets. A
further problem for this two-process account is that this same focus on the target category also
holds for the mammal-mammal probes.
E. Summary
Overall, the induction data coupled with the justifications provide detailed
support for the distinctive features/categories account (see also Sloman, 1993).
They also highlight the importance of focusing on the target, the base, and the
relation between them. On this view, we suggest that humans have the most
distinctive features/categories, followed closely by plants, then insects and
mammals. Note that other explanations for the pattern of asymmetries can,
in principle, describe some of these trends, but they do so only by imposing
some implausible assumptions. For example, consider the alternative view
that differences in knowledge underlie the asymmetries. To capture the full
range of data, this view would have to make the dubious claim that children
have more knowledge about plants than mammals. The same difficulty holds
for accounts that appeal to ideals, histories of induction, or prior odds. Only
the distinctive features/categories proposal can account for the ordering of
asymmetries that emerge in the data.
What are the important implications of these findings? First, we suggest that
the human-animal asymmetries that have been observed in category-based
induction tasks do not bear on the status of bases as being either more
familiar or better examples than targets. More generally, we suggest that
accounts which focus primarily on the base are theoretically and empirically
insufficient, for they cannot account for the range of asymmetries that emerge
in children’s inductive inferences about the biological world.
We have argued that the most promising explanation for the observed
asymmetries in inductive inference is one that focuses on the target, the base,
and the relation between them. More specifically, we suggest that such an
account must incorporate not only the categories and features of the target,
but also the distinctive categories and features that emerge when the tar-
get is compared with the base. This more inclusive view can account for the
full pattern of asymmetries, the observation that nonhuman mammal to non-
human mammal inferences are stronger than human to nonhuman mammal
inferences, and the patterns of justification provided by children in category-
based induction tasks. When it comes to categories, it appears that mammals
are animals and little else, other than a specific kind of animal. Specific plants
may be both plants (as a contrast with animals) and living things. Humans
both contrast with and constitute animals. These categorical distinctions are
paralleled by corresponding differences in distinctive features as a function of
comparison direction.
A review of the literature reveals that, as is often the case, the seeds of this
account are evident in previous work. For example, the idea that distinctive
features of the target may be weighted more than distinctive features of the
base is directly taken from Sloman’s (1993) feature-based induction model.
As another example, we point out that at least one earlier model of induction,
the similarity-coverage model (Osherson et al., 1990), also drew on cate-
gories and relations among categories to account for patterns of inductive
inference.
However, the account that we have offered here, the distinctive cate-
gories/features account, goes beyond its progenitors in (at least) three ways.
First, it broadens the range of categories that are relevant for induction.
returning to Carey’s original work, our data suggest that the human-animal
asymmetries observed in the induction paradigm do not indicate that humans
are typical or ideal animals. Instead the full range of evidence converges on the
opposite conclusion – that humans are not singled out for their prototypicality
as animals, but for their marked distinctiveness.
References
Anggoro, F., Waxman, S., & Medin, D. (2005). The effect of naming practices on children’s
understanding of living things. Paper presented at Cognitive Science 2005, Stresa,
Italy.
Atran, S., Medin, D., Lynch, E., Vapnarsky, V., Ucan Ek’, E., & Sousa, P. (2001). Folkbiology
doesn’t come from folkpsychology: Evidence from Yukatek Maya in cross-cultural
perspective. Journal of Cognition and Culture, 1, 1–40.
Baillargeon, R. (1994). How do infants learn about the physical world? Current Directions
in Psychological Science, 3(5), 133–140.
Barsalou, L. W. (1985). Ideals, central tendency, and frequency of instantiation as de-
terminants of graded structure in categories. Journal of Experimental Psychology:
Learning, Memory, & Cognition, 11(4), 629–654.
Carey, S. (1985). Conceptual change in childhood. Cambridge, MA: Bradford Books.
Collins, A., & Michalski, R. (1989). The logic of plausible reasoning: A core theory.
Cognitive Science, 13, 1–49.
Gelman, S. A. (2003). The essential child: Origins of essentialism in everyday thought. New
York: Oxford University Press.
Gelman, S. A., & Markman, E. M. (1986). Categories and induction in young children.
Cognition, 23(3), 183–209.
Gleitman, L. R., Gleitman, H., Miller, C., & Ostrin, R. (1996). Similar, and similar
concepts. Cognition, 58(3), 321–376.
Goodman, N. (1983). Fact, fiction, and forecast. New York: Bobbs-Merrill. (Original work
published 1955.)
Gutheil, G., Bloom, P., Valderrama, N., & Freedman, R. (2004). The role of histor-
ical intuitions in children’s and adults’ naming of artifacts. Cognition, 91(1), 23–
42.
Gutheil, G., Vera, A., & Keil, F. C. (1998). Do houseflies think? Patterns of induction and
biological beliefs in development. Cognition, 66(1), 33–49.
Hatano, G., & Inagaki, K. (1994). Young children’s naive theory of biology. Cognition,
50(1–3), 171–188.
Hatano, G., & Inagaki, K. (2003). The formation of culture in mind: A sociocultural
approach to cognitive development. In J. Mehler, S. Carey, & L. L. Bonatti (Eds.),
Cognitive development and conceptual change. Cambridge, MA: MIT Press.
Hatano, G., & Inagaki, K. (2002). Young children’s thinking about the biological world.
New York: Psychology Press.
Inagaki, K., & Hatano, G. (2001). Children's understanding of mind-body relationships.
In M. Siegal & C. Peterson (Eds.), Children’s understanding of biology and health.
Cambridge, UK: Cambridge University Press.
Johnson, S., & Carey, S. (1998). Knowledge enrichment and conceptual change in
folkbiology: Evidence from Williams syndrome. Cognitive Psychology, 37(2), 156–
200.
Kahneman, D., & Tversky, A. (1973). On the psychology of prediction. Psychological
Review, 80, 237–251.
Leslie, A. M. (1984). Infant perception of a manual pick-up event. British Journal of
Developmental Psychology, 2(1), 287–305.
Markman, A. B., & Gentner, D. (1997). The effects of alignability on memory. Psycho-
logical Science 8(4), 363–367.
McDonald, J., Samuels, M., & Rispoli, J. (1996). A hypothesis-assessment model of
categorical argument strength. Cognition, 59(2), 199–217.
Medin, D. L., & Atran, S. (2004). The native mind: Biological categorization and
reasoning in development and across cultures. Psychological Review, 111(4), 960–
983.
Medin, D. L., Coley, J. D., Storms, G., & Hayes, B. K. (2003). A relevance theory of
induction. Psychonomic Bulletin & Review, 10(3), 517–532.
Medin, D. L., Goldstone, R. L., & Gentner, D. (1993). Respects for similarity. Psychological
Review, 100(2), 254–278.
Medin, D. L., Lynch, E. B., Coley, J. D., & Atran, S. (1997). Categorization and reasoning
among tree experts: Do all roads lead to Rome? Cognitive Psychology, 32, 49–96.
Medin, D. L., & Smith, E. E. (1981). Strategies and classification learning. Journal of
Experimental Psychology: Human Learning & Memory, 7(4), 241–253.
Osherson, D. N., Smith, E. E., Wilkie, O., Lopez, A., & Shafir, E. (1990). Category-based
induction. Psychological Review, 97(2), 185–200.
Rosch, E., & Mervis, C. B. (1975). Family resemblances: Studies in the internal structure
of categories. Cognitive Psychology, 7(4), 573–605.
Ross, N., Medin, D., Coley, J. D., & Atran, S. (2003). Cultural and experimental differences
in the development of folkbiological induction. Cognitive Development, 18(1), 25–
47.
Shipley, E. F. (1993). Categories, hierarchies, and induction. Psychology of Learning and
Motivation, 30, 265–301.
Sloman, S. A. (1993). Feature-based induction. Cognitive Psychology, 25, 231–280.
Smith, E. E., & Medin, D. L. (1981). Categories and concepts. Cambridge, MA: Harvard
University Press.
Smith, E. E., Shoben, E. J., & Rips, L. J. (1974). Structure and process in semantic
memory: A featural model for semantic decisions. Psychological Review, 81(3), 214–
241.
Spelke, E. S. (1990). Principles of object perception. Cognitive Science, 14(1), 29–56.
Testa, T. J. (1975). Causal relationships and associative formation. Dissertation Abstracts
International, 35(12-B, Pt 1), 6147.
Tversky, A. (1977). Features of similarity. Psychological Review, 84(4), 327–352.
Waxman, S. R. (2005). Why is the concept “Living Thing” so elusive? Concepts, lan-
guages, and the development of folkbiology. In W. Ahn, R. L. Goldstone, B. C.
Love, A. B. Markman, & P. Wolff (Eds.), Categorization inside and outside the labora-
tory: Essays in honor of Douglas L. Medin. Washington, DC: American Psychological
Association.
Waxman, S. R., Lynch, E. B., Casey, K. L., & Baer, L. (1997). Setters and samoyeds:
The emergence of subordinate level categories as a basis for inductive inference in
preschool-age children. Developmental Psychology, 33(6), 1074–1090.
Wellman, H. M., & Gelman, S. A. (1992). Cognitive development: Foundational theories
of core domains. Annual Review of Psychology, 43(1), 337–375.
Wisniewski, E. J. (1997). When concepts combine. Psychonomic Bulletin & Review, 4(2),
167–183.
1 No offense to the people of Uzbekistan, whose country was chosen at random from the too-long
list of countries about which I know virtually nothing. The suggestion that Uzbekistanis have
bad breath (or, as alleged below, that they eat salted fish, drink vodka, or don’t wear business
suits) is a figment of the author’s imagination.
2 Actually, you would be justified in thinking that its prevalence is slightly higher in Uzbekistan,
because you do have a sample – albeit a very small one of size 1 – of people from Uzbekistan
with bad breath. After all, if bad breath is as uncommon in Uzbekistan as everywhere else, it
is pretty surprising that the one Uzbekistani you get to meet just happens to be one of the few
that have bad breath. But it all depends on your beliefs regarding the prior distribution over the
prevalence of bad breath. If you believe that either everybody in a country has bad breath or no
one does, then the single example would be enough to conclude in favor of the former.
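To make the footnote's last point concrete, here is a minimal sketch (my own illustration in Python; the two-point prior over prevalence and its weights are assumptions for the sake of the example, not anything claimed in this chapter) of why a single observation is decisive under such an extreme prior:

# Sketch of the footnote's point: with a prior concentrated on "everyone has
# bad breath" vs. "no one does," a single observed case is decisive.
# The hypotheses and the prior below are illustrative assumptions, not data.

def posterior_everyone(prior_everyone=0.5, observed_with_bad_breath=1, sample_size=1):
    """Posterior probability that everyone has bad breath, given a tiny sample,
    under a two-point prior: prevalence is either 1.0 ("everyone") or 0.0 ("no one")."""
    # Likelihood of the sample under each hypothesis (prevalence = 1 or prevalence = 0)
    like_everyone = 1.0 if observed_with_bad_breath == sample_size else 0.0
    like_no_one = 1.0 if observed_with_bad_breath == 0 else 0.0
    prior_no_one = 1.0 - prior_everyone
    evidence = like_everyone * prior_everyone + like_no_one * prior_no_one
    return like_everyone * prior_everyone / evidence

print(posterior_everyone())  # 1.0: one case settles it under this extreme prior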
that you actually know a little bit more about Uzbekistan. For example, imag-
ine that you know how Uzbekistanis traditionally dress, some characteristic
facial features, common religious beliefs, and so on. According to one promi-
nent theory, you will be more likely to think that bad breath is frequent among
Uzbekistanis if your new friend has many of these typical Uzbekistani traits,
as compared to if he is, say, wearing a Western-style business suit, endorses a
traditional western religion (e.g., Roman Catholicism), and so on.
Do you find this example compelling? Perhaps my intuitions have been
ruined by reading too many papers about induction, but I can’t say that I do. I
might be slightly more willing to endorse the generalization about bad breath
on the basis of typical or characteristic features, but not by much, and I think
the reason has to do with the second model of category-based generalization
illustrated by the next example. You’re chatting with the Uzbekistani, but
now, instead of knowing a few typical properties of people from that country,
you know a few of the causes of bad breath. Let’s say you know that eating
salted fish causes bad breath, and so too does drinking vodka. You then learn
from the Uzbekistani that salted fish is one of the staples of the Uzbekistani
diet and that vodka is consumed at every meal. Now suddenly the idea that
large numbers of Uzbekistanis frequently have bad breath doesn’t seem so far
fetched after all.
The first example suggesting that generalizations are based on an example’s
typicality is the province of the well-known Similarity-Coverage Model of
category-based induction (Osherson, Smith, Wilkie, Lopez, & Shafir, 1990)
discussed in numerous places throughout this volume (also see Sloman, 1993,
for a related model). For example, according to the Similarity-Coverage Model
(hereafter SCM), people will be more confident that all birds have some new
property (e.g., sesamoid bones) when given an example of a sparrow with
sesamoid bones as compared to a penguin, because sparrows are more typical
birds than penguins (and thus, on this account, provide more “coverage”
of the features of the bird category). Likewise, the model would predict
that you would generalize bad breath to Uzbekistanis more strongly if your
one example was typical and thus covers the “Uzbekistani” category more
completely. In addition to coverage, the second component of the SCM –
similarity – allows it to account for generalizations between categories that
are not hierarchically nested. For example, people will be more confident that
blackbirds have sesamoid bones given the fact that crows do as compared
to sparrows, because (according to the SCM) blackbirds are more similar to
crows than sparrows.
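As a rough computational sketch of these two components – not the published formalization of Osherson et al. (1990), and with made-up similarity values and an arbitrary mixing weight – the SCM's behavior on the bird examples above might be captured as follows (Python):

# A rough sketch of the Similarity-Coverage Model (Osherson et al., 1990).
# The similarity matrix, the category extension, and the weight "a" are
# illustrative assumptions, not values from the chapter.

SIM = {  # symmetric pairwise similarities on a 0-1 scale (made up)
    ("sparrow", "blackbird"): 0.8, ("crow", "blackbird"): 0.9,
    ("sparrow", "penguin"): 0.2, ("sparrow", "robin"): 0.9,
    ("penguin", "robin"): 0.2, ("crow", "robin"): 0.7,
    ("sparrow", "crow"): 0.7, ("penguin", "blackbird"): 0.2,
    ("penguin", "crow"): 0.2, ("robin", "blackbird"): 0.8,
}

def sim(x, y):
    return 1.0 if x == y else SIM.get((x, y), SIM.get((y, x), 0.0))

def coverage(premises, members):
    """Mean over category members of their max similarity to any premise."""
    return sum(max(sim(p, m) for p in premises) for m in members) / len(members)

def scm_general(premises, conclusion_members):
    """Strength of a general argument (conclusion = a whole category, e.g. 'all birds')
    is driven by how well the premises cover that category."""
    return coverage(premises, conclusion_members)

def scm_specific(premises, conclusion, superordinate_members, a=0.5):
    """Strength of a specific argument mixes premise-conclusion similarity with
    coverage of the lowest superordinate containing premises and conclusion."""
    similarity = max(sim(p, conclusion) for p in premises)
    return a * similarity + (1 - a) * coverage(premises, superordinate_members)

birds = ["sparrow", "penguin", "robin", "crow", "blackbird"]
# Sparrows cover "bird" better than penguins, so sparrow -> all birds is stronger:
print(scm_general(["sparrow"], birds), scm_general(["penguin"], birds))
# Crow -> blackbird is stronger than sparrow -> blackbird (higher similarity):
print(scm_specific(["crow"], "blackbird", birds), scm_specific(["sparrow"], "blackbird", birds))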
In contrast, the second mode of reasoning, referred to here as generalization
as causal reasoning (hereafter GCR), focuses on the specific causal explana-
tions which might lead one to believe that most or all members of a particular
category display a particular property. Although it may be vague or incom-
plete, one often has at least some general idea of the type of causal mechanisms
which produce or generate particular properties, and this knowledge can be
used to estimate the prevalence of that property in some population (i.e.,
category of object). This view also explains why, despite the predictions of the
SCM, the influence of typicality may be uncompelling in certain cases. For
example, I usually attribute bad breath to a transient property – something
consumed at a recent meal for instance – and this makes me reluctant to
consider bad breath a stable property of even that individual person, to say
nothing of the person’s nationality, ethnicity, or any other group he or she
might belong to. It also leads me to believe that the fact that the person is or
is not displaying a typical property on some other dimension (their clothing
style) is irrelevant to the question of generalizing bad breath. Of course, to be
fair to the SCM, it was never intended to be a model of the mental processes
that arise when one has knowledge of the causal mechanisms which might
generate the to-be-generalized property. Famously, its predictions were in-
tended to apply only to “blank properties” (like sesamoid bones) about which
one has no prior knowledge. Nevertheless, the example illustrates how easy
it may be for its similarity-based processes to be supplanted by explanation-
based ones.
Moreover, it may often be the case that even when more typical examples
do support stronger generalization, they do so not because of typicality per se,
but rather because of the distinct pattern of reasoning that a typical example
may support. For example, if you, unlike me, did think that an Uzbekistani
dressed in traditional clothing justified a stronger generalization, I suspect you
did so because you recognized that (a) bad breath can be attributed to a stable
property such as the person’s diet, (b) because diets and ways of dress covary,
this diet (whatever it is) may be common to many Uzbekistanis, and therefore
(c) so too is bad breath. But although this makes you a deeper reasoner than
me, you are reasoning nonetheless, not just basing your judgment on some
unanalyzed notion of typicality.
This chapter is divided into three sections. The first will present the cur-
rent empirical evidence that people in fact engage in causal reasoning when
generalizing properties. This evidence will come both from studies testing
category to another to the extent that the categories are similar.3 In particular,
the SCM does not allow for how such generalizations might depend on the
property itself, by virtue of, say, the specific causal mechanisms it is involved
in. Heit and Rubinstein showed otherwise. They found, for example, that
a behavioral property (e.g., travels in a zig-zag path) was generalized more
strongly from tunas to whales as compared to from bears to whales. This result
may have been expected on the basis of the fact that whales are more similar
to tunas than bears. But when the novel property was biological rather than
behavioral (e.g., a liver with two chambers that acts as one), it was generalized
more strongly from bears to whales instead. Why should this reversal arise
despite the fact that the categories involved (bears, tunas, and whales) were
unchanged, and thus so too were the similarity relations? One explanation
is that participants thought that bears and whales share biological properties
(such as two-chambered livers), because such properties are likely to arise
from causal mechanisms associated with their common biological category,
that is, mammals. Tunas and whales, on the other hand, are more likely to
share a survival behavior (traveling in a zig-zag path) because they are both
prey animals living in a common ecology (also see Gelman & Markman, 1986,
and Springer & Keil, 1989).4
Sloman (1994, 1997) has provided other examples of how a property will be
more strongly generalized when the causal history which leads to a property
in a base category is also present in the target. For example, he found that un-
dergraduates were more willing to project a feature like “hired as bodyguard”
from war veterans to ex-convicts, presumably because the underlying expla-
nation (fighting experience makes for a good bodyguard) for war veterans
also applies to ex-convicts. In contrast, they were less willing to project the
property "unemployed," apparently because the reasons for unemployment
of war veterans (physical injury, PTSD) do not apply to ex-convicts. Note
again that these different results obtained despite the fact that similarity was
held constant (in both cases war veterans were the base and ex-convicts were
the target).
Finally, Smith, Shafir, and Osherson (1993) also found that subjects appar-
ently engaged in a form of causal reasoning when generalizing hypothetical
figure 4.1. Number of dependents of novel property. Results from Hadjichristidis et al. (2004), Experiment 2. [Graph not reproduced: generalization ratings for High Similarity and Low Similarity conditions when the novel property has many, few, or no dependents.]
target category not on the basis of any one characteristic of the feature (e.g.,
its centrality), but rather on the basis of one’s beliefs about the causal laws
which relate the feature to those of the target category.
The generalization as causal reasoning view (GCR) makes (at least) three
predictions regarding how causal knowledge supports category-based gener-
alizations. The first of these is that property generalization can be an instance
of diagnostic reasoning in which one reasons from the presence of a novel
property’s effects to the presence of the novel property itself. The second
prediction is that generalizations can reflect prospective reasoning, in which
one reasons from the presence of the causes of a novel property to infer the
presence of the property. The third prediction is that generalizations should
exhibit the basic property of extensional reasoning in which a novel prop-
erty will be more prevalent among category members to the extent its causes
and/or effects are prevalent. As it turns out, each of these predictions has
received empirical support.
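A minimal sketch of the first two predictions, assuming a toy causal structure in which the novel property N has a single cause C and a single effect E (all of the probabilities below are illustrative assumptions, not the chapter's own formalism):

# A minimal sketch (not the chapter's own formalism) of diagnostic and
# prospective reasoning about a novel property N with one cause C and one
# effect E. All probabilities below are illustrative assumptions.

P_N = 0.3            # prior belief that a category member has the novel property
power_C_N = 0.9      # P(N | C): the cause reliably produces the novel property
power_N_E = 0.9      # P(E | N): the novel property reliably produces its effect
P_E_without_N = 0.1  # background rate of the effect when N is absent

# Prospective reasoning: infer N from the presence (or absence) of its cause C.
p_N_given_C = power_C_N
p_N_given_not_C = 0.05   # assumed small background rate of N without its cause
print("P(N | cause present) =", p_N_given_C)
print("P(N | cause absent)  =", p_N_given_not_C)

# Diagnostic reasoning: infer N from observing its effect E (Bayes' rule).
p_E = power_N_E * P_N + P_E_without_N * (1 - P_N)
p_N_given_E = power_N_E * P_N / p_E
print("P(N | effect observed) =", round(p_N_given_E, 2))  # ~0.79, above the 0.3 prior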
5 Whether one finds multiple versus few symptoms more diagnostic of an underlying common
cause will depend on the details of the causal beliefs involved. If one believes there are no
other possible causes of the symptoms, then even just one symptom is sufficient to infer the
cause with certainty. Also relevant is whether the cause is deterministic (produces its effects
with probability 1) or probabilistic. A deterministic cause invariably produces all of its effects,
and thus the presence of the cause is ruled out if even one of its effects is absent. For a
probabilistic representation of causality see Rehder (2003a, b) and Cheng (1997). For discussion
of a deterministic versus probabilistic view of causality, see Thagard (1999).
figure 4.2. Cause of novel property. Results from Rehder (2006), Experiment 3. [Graph not reproduced: generalization ratings when the cause of the novel property is Present in the target, when No Cause information is given, and when the cause is Absent.]
be presented with a Rogo that had ceramic shocks, and then judged what
proportion of all Rogos had ceramic shocks. Properties like ceramic shocks
were chosen because they have no obvious causal connection with the other
known properties of Rogos (e.g., butane-laden fuel, carbon monoxide in the
exhaust, etc). But on other trials the novel property was described as being
caused by one of the Rogo’s characteristic features. For example, subjects
would be presented with a Rogo which had a typical property (e.g., hot en-
gine temperature) and a novel property (e.g., melted wiring) and told that the
typical property caused the novel one (“The melted wiring is caused by the
high-engine temperature”). A second Rogo was presented which either did
or did not have high-engine temperature, and subjects were asked whether it
had melted wiring.
The results, presented in Figure 4.2, show that the novel property was
generalized very strongly when its cause was present in the target Rogo, very
weakly when it was absent, and with an intermediate rating when no infor-
mation about causal mechanism was provided. The results were the same
regardless of whether the category was an artifact (e.g., Rogos), a biological
kind, or a nonliving natural kind. Clearly, subjects reason forward (prospec-
tively) to infer a novel property’s presence from the presence of its causes as
readily as they reason backward (diagnostically) to infer a property’s presence
from its effects (also see Lassaline, 1996, and Wu & Gentner, 1998).
of a novel property depends on the prevalence of its causes and/or effects. For
example, the prevalence of a hormone in a target category shouldn't depend
just on the number of physiological functions it causes, but also on how widespread
those functions are amongst the population of target category members.
Similarly, the prevalence of melted wiring caused by hot engines in a brand of
automobile should depend on what proportion of those automobiles in fact
have hot engines.
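The extensional prediction amounts to simple arithmetic over prevalences. A hedged sketch, with an assumed causal power and background rate chosen only for illustration:

# Illustrative arithmetic for the extensional prediction: the expected prevalence
# of a novel property tracks the prevalence of its cause. The causal power and
# background rate below are assumptions for the sake of the example.

def expected_prevalence(p_cause, causal_power=0.9, background=0.05):
    """Prevalence of the novel property = cases produced by the cause plus a
    small background rate among members lacking the cause."""
    return p_cause * causal_power + (1 - p_cause) * background

# Hot engines in 90% versus 60% of the brand's automobiles:
print(expected_prevalence(0.90))  # ~0.82 -> melted wiring should be judged common
print(expected_prevalence(0.60))  # ~0.56 -> noticeably less common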
Maya Nair and I decided to test this prediction by explicitly manipulating
the prevalence of a category feature which was the purported cause (or effect)
of a novel property (Nair, 2005). The same categories as in Rehder (2006) were
used (including Romanian Rogos), and subjects were asked what proportion
of all Rogos had a novel property on the basis of a single example of a Rogo
with that property. Two factors were manipulated as within-subjects variables.
The first was the base rate of the characteristic feature. Two randomly chosen
features of Rogos were described as occurring in 90% of Rogos, whereas the
other two were described as occurring in 60%. The second orthogonal factor
was whether the novel property was described as caused by or as the cause
of one of the characteristic features. The characteristic features, the novel
properties, and the causal relationships between the two are presented in
Table 4.1 for Rogos. For example, some subjects would be presented with
a Rogo which had all four characteristic features and a novel property (e.g.,
zinc-lined gas tank) and told that one of the typical properties caused the novel
one. (“Butane-laden fuel causes a zinc-lined gas tank. The butane interacts
with the chromium in the metal of the gas tank, which results in a thin
layer of zinc on the inside of the tank.”) Other subjects would be told that
the novel property was the cause of the typical property. (“A zinc-lined gas
tank causes the fuel to be butane-laden. The zinc prevents corrosion of the
tank, but interacts chemically with gasoline to produce butane.”) All subjects
would then rate what proportion of all Rogos possessed the novel property.
Each subject performed four generalization trials in which the four novel
properties in Table 4.1 were each presented once, either as a cause and effect
of a characteristic feature, and with the base rate of the characteristic feature
(e.g., butane-laden fuel) described as either 90% or 60%.
Generalization ratings from this experiment are presented in Figure 4.3A
as a function of whether the novel property was the cause or the effect and
the base rate of the typical property. There was a strong effect of base rate, as
subjects’ generalization ratings were much higher when the novel property was
causally related to a characteristic feature with a 90% versus a 60% base rate.
Moreover, this result obtained regardless of whether the characteristic feature
table 4.1. Features and causal relationships for Romanian Rogos, an artificial category

Characteristic feature: Butane-laden fuel. Novel feature: Zinc-lined gas tank.
Characteristic Feature → Novel Feature: Butane-laden fuel causes a zinc-lined gas tank. The butane interacts with the chromium in the metal of the gas tank, which results in a thin layer of zinc on the inside of the tank.
Novel Feature → Characteristic Feature: A zinc-lined gas tank causes the fuel to be butane-laden. The zinc prevents corrosion of the tank, but interacts chemically with gasoline to produce butane.

Characteristic feature: Loose fuel filter gasket. Novel feature: Vibrations during braking.
Characteristic Feature → Novel Feature: A loose fuel filter gasket causes vibrations during braking. The fuel which leaks through the fuel filter gasket falls on one of the brake pads, causing abrasion which results in the car vibrating while braking.
Novel Feature → Characteristic Feature: Vibrations during braking cause a loose fuel filter gasket. The rattling caused by the vibrations eventually leads to the fuel filter gasket becoming loose.

Characteristic feature: Hot engine temperature. Novel feature: Thin engine oil.
Characteristic Feature → Novel Feature: Hot engine temperature causes thin engine oil. The oil loses viscosity after it exceeds a certain temperature.
Novel Feature → Characteristic Feature: Thin engine oil causes hot engine temperature. Thin oil does not provide sufficient lubrication for the engine's moving parts, and the engine temperature goes up as a result.

Characteristic feature: High amounts of carbon monoxide in the exhaust. Novel feature: Inefficient turbocharger.
Characteristic Feature → Novel Feature: High amounts of carbon monoxide in the exhaust cause an inefficient turbocharger. As the exhaust leaves the engine it passes through the turbocharger. The lower density of carbon monoxide in the exhaust means that the turbocharger is not sufficiently pressurized.
Novel Feature → Characteristic Feature: An inefficient turbocharger causes high amounts of carbon monoxide in the exhaust. An inefficient turbocharger fails to inject enough oxygen into the engine, and so excess carbon does not undergo combustion.
figure 4.3. Observed base rate. Results from Nair (2005). [Graphs not reproduced: generalization ratings for the Novel As Cause and Novel As Effect conditions. Panel A: explicitly stated base rates of 60% and 90%; Panel B: observed base rates of 62.5% and 87.5%.]
was the cause or effect of the typical property. In other words, subjects readily
engage in the extensional reasoning which characterizes causal reasoning
more generally.
In a follow-up experiment, we asked whether these results would obtain
when the feature base rates were observed rather than given through explicit
instruction. Informal post-experiment interviews in the first experiment re-
vealed that the base rate manipulation was very powerful. Although general-
ization ratings were entered by positioning a slider on a computer-displayed
scale which was not labeled numerically (one end was simply labeled “None,”
meaning that no Rogos have the novel property; the other end was labeled
“All,” meaning they all do), a number of subjects reported trying to position
the marker at a point that they felt corresponded to 60% or 90%. That is, if
a novel property was causally related to a typical property with a base rate
of 90% (or 60%), many subjects felt that that property would be displayed
by exactly 90% (or 60%) of Rogos. To determine if the extensional reasoning
effect obtained only because the base rates were explicit, we performed a
replication of the first experiment in which the feature base rates were learned
implicitly through a standard classification-with-feedback task. In the first
phase of the experiment subjects were asked to learn to distinguish Romanian
Rogos from “some other kind of automobile.” They were presented with the
exemplars shown in Table 4.2, consisting of eight Rogos and eight non-Rogos.
In Table 4.2, a “1” stands for a typical Rogo feature (e.g., loose fuel filter gasket,
hot engine temperature, etc.), whereas each “0” stands for the opposite value
on the same dimension (e.g., tight fuel filter gasket, cool engine temperature,
etc.). The thing to note is that on two of the dimensions (Dimensions 1 and
2 in Table 4.1), the typical Rogo feature is very typical, because it occurs in
seven out of eight Rogos (or 87.5%). On the other two dimensions (3 and 4),
the characteristic feature is less typical, occurring in five out of eight Rogos
(or 62.5%). The structure of the contrast category (the non-Rogos) is the
mirror image of the Rogos.

table 4.2. Rogo and non-Rogo exemplars used in the classification task (1 = typical Rogo feature on dimensions D1–D4; 0 = the opposite value)

D1 D2 D3 D4
Rogos
R1 1 1 0 1
R2 1 1 0 1
R3 1 1 0 1
R4 1 1 1 0
R5 1 1 1 0
R6 1 1 1 0
R7 0 1 1 1
R8 1 0 1 1
Non-Rogos
NR1 0 0 1 0
NR2 0 0 1 0
NR3 0 0 1 0
NR4 0 0 0 1
NR5 0 0 0 1
NR6 0 0 0 1
NR7 1 0 0 0
NR8 0 1 0 0
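The intended base rates can be read directly off this exemplar matrix. A small sketch in Python (my own check, not part of the experiment's procedure):

# Reading the feature base rates off the Rogo exemplars in Table 4.2.
# Rows are Rogos R1-R8; columns are dimensions D1-D4; 1 = typical Rogo feature.

rogos = [
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [1, 0, 1, 1],
]

for d in range(4):
    rate = sum(row[d] for row in rogos) / len(rogos)
    print(f"D{d + 1}: {rate:.1%}")   # D1, D2 -> 87.5%; D3, D4 -> 62.5%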
The items in Table 4.2 were presented in a random order in blocks, subjects
classified each and received immediate feedback, and training continued until
subjects performed two blocks without error or reached a maximum of twenty
blocks. They then performed the same generalization task as in the previous
experiment in which whether the novel property was the cause or effect was
crossed with the base rate of the typical Rogo feature (87.5% or 62.5%). The
results of this experiment are presented in Figure 4.3B. The figure indicates
that subjects in this experiment, just like those in the first one, took the
base rate of the characteristic feature into account when generalizing the new
property. They did this even though the difference in base rates between
features was manifested in observed category members rather than provided
explicitly. That is, people take into account the prevalence of that characteristic
feature in the population of category members when generalizing the novel
property, and they do so regardless of whether a novel property is the effect
or cause of a characteristic feature. Note that in both experiments from
Nair (2005), the results were unaffected by whether the category was a novel
artifact, biological kind, or nonliving natural kind.
In summary, studies provide strong evidence that people are exhibiting
some of the basic properties of causal reasoning when generalizing properties.
When many of a novel property’s effects are present in a category, people
reason (diagnostically) to the presence of the property itself. People also
reason (prospectively) to the presence of the property as a function of whether
its causes are present or absent in the target. Finally, people are sensitive to
how prevalent a novel property’s cause (or effect) is among the population of
category members.
experimental factor was the typicality of the Rogo, which had either 1, 2,
3, or 4 characteristic features. A Rogo always possessed at least the character-
istic feature which was described as the cause of a novel property (e.g., hot
engine temperature in the case of melted wiring).
Based on the SCM, the prediction of course was that the generalization
of blank properties would strengthen with the number of characteristic fea-
tures. The critical question concerned whether typicality would also affect
the generalization of novel properties when a causal explanation was present.
The results are presented in Figure 4.4A as a function of whether a causal
explanation was provided and the typicality of the exemplar. As expected,
generalization ratings for blank properties increased as the exemplar’s typi-
cality (i.e., its number of characteristic features) increased, indicating that this
experiment replicated SCM’s standard typicality effect for blank properties.
However, Figure 4.4A indicates that this effect of typicality was reduced when
a causal explanation was provided.
One important question is whether the response pattern shown in Figure
4.4A is manifested consistently by all participants, or whether it arose as
a result of averaging over individuals with substantially different response
profiles. In fact, two subgroups of participants with qualitatively different
responses were identified, shown in Figures 4.4B and 4.4C. The subgroup in
Figure 4.4B produced higher induction ratings for the nonblank properties as
compared to the blanks, that is, they were more willing to generalize a novel
property when accompanied by a causal explanation. Importantly, however,
whereas ratings for blank properties were sensitive to typicality, typicality
had no effect on the generalization of the nonblanks. In contrast, the second
subgroup in Figure 4.4C did not generalize nonblanks more strongly than
blanks, and both types of properties were equally sensitive to typicality. In
other words, when reasoners chose to base their responses on the causal
explanations (as evidenced by the nonblank’s higher ratings in Figure 4.4B),
there was no effect of typicality; when they chose to base their responses on
typicality, there was no effect of causal explanations (Figure 4.4C). Apparently,
the use of a causal explanation versus typicality is an all-or-none matter, with
reasoners using one strategy or the other, but not both.
figure 4.4. Number of features in source example. Results from Rehder (2006), Experiment 1. [Graphs not reproduced: generalization ratings for Blank and Nonblank properties as a function of the number of characteristic features (1 to 4) in the source exemplar. Panel A: all participants; Panels B and C: the two subgroups described in the text.]
two exemplars were chosen so that both always had three features, so that their
typicality was held constant across low- and high-diversity trials.) Whether
the novel property had a causal explanation was manipulated orthogonally.
On the basis of the SCM, the prediction was that the generalization of blank
figure 4.5. Similarity of source and target. Results from Rehder (2006), Experiment 3. [Graph not reproduced: generalization ratings at Low and High source-target similarity for Blank properties, Nonblank properties with the cause present in the target, and Nonblank properties with the cause not in the target.]
did not appear. The important finding is that the projection of nonblank
properties was much less sensitive to the similarity of the base and target
exemplar. (Unlike the previous two experiments, the group-level pattern of
responding, shown in Figure 4.5, was exhibited by virtually all participants.)
Once again, it appears that the effect of causal explanations is to draw attention
away from features not involved in the explanation, with the result that the
two exemplars’ similarity becomes largely irrelevant to how strongly a novel
property is generalized from one to the other.6
At first glance it may seem that these results conflict with those from Had-
jichristidis et al. (2004) which showed that novel properties were projected
more strongly as a function of the similarity between the base and target
(Figure 4.1). Recall however that Hadjichristidis et al.’s claim is that similarity
will be used to estimate dependency structure when its presence in the target
category is otherwise uncertain. This situation was manifested in their ex-
periments by referring to a hormone’s dependents as nameless “physiological
functions,” making it impossible to determine whether those specific func-
tions were present in the target category. In contrast, in the current experiment
subjects were told exactly which category features were causally related to the
novel feature, a situation which apparently invited them to largely ignore how
the base and target might be similar on other dimensions.
6 Note that in this experiment the effect of similarity was not completely eliminated (as the effects
of typicality and diversity were eliminated in the first two experiments), as the generalization
rating of nonblank properties was on average 6 points higher (on a 100 point scale) when the
base and target were similar as compared to dissimilar, a difference which reached statistical
significance. Nevertheless, note that the magnitude of this similarity effect is vastly lower than
it is for blank properties (34 points). (See Lassaline, 1996, for related results.)
In summary then, these three experiments support the claim that when a
causal explanation for a novel property is available, it often supplants simi-
larity as the basis for the generalization of that property. First, in the first ex-
periment all participants exhibited an effect of typicality for blank properties,
but half of those participants showed no sensitivity to typicality when the
novel property was accompanied with a causal explanation (nonblanks). Sec-
ond, in the second experiment half the participants exhibited sensitivity to
diversity for blank properties, but that sensitivity was completely eliminated
for nonblanks for all participants. Finally, in the third experiment virtually
all participants exhibited sensitivity to similarity for blanks, but this effect
was (almost) completely eliminated for nonblanks. Apparently, when people
note the presence of a causal explanation for a novel property, it often draws
attention away from the exemplars’ other features, making their similarity (or
typicality or diversity) largely irrelevant to the inductive judgment.
emerging theories of biological kinds led the older children to expect such
kinds to be more structured and constrained, and hence more homogeneous.
Novel properties of biological kinds are generalized more strongly because
this expectation of homogeneity extends to new properties in addition to
existing ones (also see Gelman et al., 1988; Shipley, 1993).
Additional evidence comes from studies investigating which level in a taxo-
nomic hierarchy supports the strongest inductions. Coley, Medin, and Atran
(1997) presented both American undergraduates and the Itza’ with a sub-
species (e.g., black vultures) that exhibited an unspecified disease and then
tested the degree to which that disease was generalized to vultures (a category
at the species or folk-generic level), birds (the life form level), or all animals (the
kingdom level). They found that for both groups the strength of generaliza-
tions dropped substantially when the target category was at the life form level
or higher. Again, biological species appear to be especially potent targets of
inductive generalizations (also see Atran, Estin, Coley, & Medin, 1997).
One notable aspect of these latter findings is the fact that American un-
dergraduates, like the Itza’, treated the species level as inductively privileged
despite the well-known result that for these individuals the basic level is nor-
mally one level higher in the hierarchy, namely, at the level of life form (e.g.,
tree or bird) rather than species (e.g., oak or robin) (Rosch, Mervis, Gray,
Johnson, & Boyes-Braem, 1976). One explanation that Coley et al. offer for
this discrepancy is that whereas the explicit knowledge that the American
students possess (which is tapped, for example, by the various tasks used by
Rosch et al., such as feature listing) is sufficient only for life forms to emerge
as identifiable feature clusters, their expectations are nevertheless that individ-
ual species are those biological categories with the most inductive potential.
(In contrast to the Americans, for the Itza’ the species level is both basic
and inductively privileged, a result Coley et al. suggest reflects their greater
explicit ecological knowledge as compared to the Americans.) They further
suggest that these expectations reflect an essentialist bias in which species are
presumed to be organized around a causally potent essence which determines
not only category membership, but also the degree to which category mem-
bers are likely to share (known and to-be-discovered) features (Gelman, 2003;
Medin & Ortony, 1989).
Can studies which experimentally manipulate the knowledge associated
with a category shed any light on the details of how interfeature theoret-
ical knowledge promotes generalizations? In particular, can we find direct
evidence that a category based on an underlying cause (i.e., an essence) pro-
motes generalizations?
figure 4.6. Network topologies tested in Rehder and Hastie (2004), Experiment 2. A. Common cause network. B. Common effect network. C. Chain network. [Diagrams of the three causal networks over features F1–F4 not reproduced.]
figure 4.7. Generalization ratings from Rehder and Hastie (2004), Experiment 1. [Graphs not reproduced: ratings in the Common Cause, Common Effect, and Chain conditions versus their Control conditions, for the incoherent exemplars (1000, 0001, and 0101, respectively) and the maximally coherent exemplar 1111.]
(and because F3 is absent even though its cause F2 is present). In other words,
it is not coherent categories that support stronger generalizations but rather
coherent category members, and this effect holds regardless of the topology
of the causal relations which make them coherent.7 As in my other studies
reported in this chapter, these results held not just for artifacts like Rogos, but
also for biological kinds and for nonliving natural kinds.
These results have three implications for the received view that biological
kinds promote inductions because of the presumed presence of an essence.
First, it may not be a presumption of a single common cause (an essence)
which promotes inductions but rather a presumption of coherence, that is,
causal relations which link category features regardless of the specific topology
of those links. Second, this effect of coherence is not limited to biological kinds
but will apply equally well to other kinds (e.g., artifacts) when generalizers
have reason to believe that one of those kinds is coherent. Third, this effect
can be reversed when the specific category member which displays a novel
feature is incoherent in the light of the category’s causal laws.
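One simple way to operationalize the coherence of an individual category member – my own illustration, not Rehder and Hastie's (2004) model – is to count the stated causal links on which an exemplar's cause and effect values disagree:

# My own illustration (not Rehder & Hastie's model): score an exemplar's
# coherence by counting causal links whose cause and effect values disagree.

def violated_links(exemplar, links):
    """exemplar: dict feature -> 0/1; links: list of (cause, effect) pairs."""
    return sum(1 for cause, effect in links if exemplar[cause] != exemplar[effect])

chain = [("F1", "F2"), ("F2", "F3"), ("F3", "F4")]
common_cause = [("F1", "F2"), ("F1", "F3"), ("F1", "F4")]

all_present = {"F1": 1, "F2": 1, "F3": 1, "F4": 1}   # exemplar 1111
alternating = {"F1": 0, "F2": 1, "F3": 0, "F4": 1}   # exemplar 0101
cause_only = {"F1": 1, "F2": 0, "F3": 0, "F4": 0}    # exemplar 1000

print(violated_links(all_present, chain))        # 0 -> coherent, strong induction
print(violated_links(alternating, chain))        # 3 -> incoherent under a chain
print(violated_links(cause_only, common_cause))  # 3 -> incoherent under a common cause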
Of course, it should be noted that our failure to discover any special in-
ductive potential of a common cause network as compared to the other two
networks may have obtained because our experimental materials failed to
fully elicit our subjects’ essentialist intuitions. For example, perhaps such
intuitions depend on the underlying nature of a category remaining vague
(e.g., even young children have essentialist intuitions, even though they know
nothing of DNA; see Gelman, 2003), but in our categories the common
cause was an explicit feature (e.g., butane-laden fuel for Rogos). Or, per-
haps those explicit features clashed with our adult subjects’ own opinions
about the nature of the “essence” of a brand of automobile (the intention of
the automobile’s designer; see Bloom, 1998) or a biological species (DNA).
Nevertheless, these results raise the possibility that the inductive potential of
categories thought to arise from folk essentialism about biological kinds may
in fact arise from a more general phenomenon, namely, theoretical coher-
ence, which potentially applies to kinds of categories in addition to biological
kinds.
This chapter has reviewed three sorts of evidence related to how causal
knowledge is involved in property generalization. The first section presented
7 In a different phase of the experiment, Rehder and Hastie (2004) also asked the subjects to
estimate the degree of category membership of the same exemplars which displayed the novel
properties. Like the generalization ratings, category membership ratings for the most typical
exemplars (i.e., 1111) were higher in the causal conditions than in the control condition, and
ratings for incoherent exemplars (e.g., 1000, 0001, and 0101 in the common cause, common
effect, and chain conditions, respectively) were lower than in the control condition. That is,
the causal knowledge which makes exemplars appear more (or less) coherent is reflected in
their perceived degree of category membership in addition to their propensity to support
generalizations. Overall, the correlations between generalization and category membership
ratings were .95 or higher.
References
Ahn, W. (1998). Why are different features central for natural kinds and artifacts? The
role of causal status in determining feature centrality. Cognition, 69, 135–178.
Ahn, W., Kim, N. S., Lassaline, M. E., & Dennis, M. J. (2000). Causal status as a
determinant of feature centrality. Cognitive Psychology, 41, 361–416.
Atran, S., Estin, P., Coley, J. D., & Medin, D. L. (1997). Generic species and basic levels:
Essence and appearance in folk biology. Journal of Ethnobiology, 17(1), 22–45.
Bailenson, J. N., Shum, M. S., Atran, S., Medin, D. L., & Coley, J. D. (2002). A bird’s eye
view: Biological categorization and reasoning within and across cultures. Cognition,
84, 1–53.
Bloom, P. (1998). Theories of artifact categorization. Cognition, 66, 87–93.
Cheng, P. (1997). From covariation to causation: A causal power theory. Psychological
Review, 104(2), 367–405.
Coley, J. D., Medin, D. L., & Atran, S. (1997). Does rank have its privilege? Inductive
inference within folkbiological taxonomies. Cognition, 64, 73–112.
Gelman, S. A. (1988). The development of induction within natural kinds and artifact
categories. Cognitive Psychology, 20, 65–95.
Gelman, S. A. (2003). The essential child: The origins of essentialism in everyday thought.
New York: Oxford University Press.
Gelman, S. A., & Markman, E. M. (1986). Categories and induction in young children.
Cognition, 23, 183–208.
Gelman, S. A., & O’Reilly, A. W. (1988). Children’s inductive inferences with superordi-
nate categories: The role of language and category structure. Child Development, 59,
876–887.
Hadjichristidis, C., Sloman, S. A., Stevenson, R., & Over, D. (2004). Feature centrality
and property induction. Cognitive Science, 28(1), 45–74.
Heit, E. (2000). Properties of inductive reasoning. Psychonomic Bulletin & Review, 7(4),
569–592.
Heit, E., & Rubinstein, J. (1994). Similarity and property effects in inductive reasoning.
Journal of Experimental Psychology: Learning, Memory, and Cognition, 20(2), 411–
422.
Lassaline, M. E. (1996). Structural alignment in induction and similarity. Journal of
Experimental Psychology: Learning, Memory, and Cognition, 22(3), 754–770.
Lopez, A., Atran, S., Coley, J. D., Medin, D. L., & Smith, E. E. (1997). The tree of life:
Universal and cultural features of folkbiological taxonomies and inductions. Cognitive
Psychology, 32(3), 251–295.
Lopez, A., Gelman, S. A., Gutheil, G., & Smith, E. E. (1992). The development of
category-based induction. Child Development, 63, 1070–1090.
Medin, D. L., Coley, J. D., Storms, G., & Hayes, B. K. (2003). A relevance theory of
induction. Psychonomic Bulletin & Review, 10(3), 517–532.
Medin, D. L., & Ortony, A. (1989). Psychological essentialism. In S. Vosniadou & A.
Ortony (Eds.), Similarity and analogical reasoning (pp. 179–196). Cambridge: Cam-
bridge University Press.
Nair, M. (2005). The role of causal reasoning in category-based generalizations. Unpub-
lished honors thesis. New York University.
Osherson, D., Smith, E. E., Myers, T. S., & Stob, M. (1994). Extrapolating human
probability judgment. Theory and Decision, 36, 103–126.
Osherson, D. N., Smith, E. E., Wilkie, O., Lopez, A., & Shafir, E. (1990). Category-based
induction. Psychological Review, 97, 185–200.
Proffitt, J. B., Coley, J. D., & Medin, D. L. (2000). Expertise and category-based induction.
Journal of Experimental Psychology: Learning, Memory, and Cognition, 26(4), 811–828.
Rehder, B. (2003a). Categorization as causal reasoning. Cognitive Science, 27(5), 709–748.
Rehder, B. (2003b). A causal-model theory of conceptual representation and catego-
rization. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29(6),
1141–1159.
Rehder, B. (2006). When similarity and causality compete in category-based property
induction. Memory & Cognition, 34, 3–16.
Rehder, B., & Hastie, R. (2004). Category coherence and category-based property in-
duction. Cognition, 91(2), 113–153.
Rehder, B., & Kim, S. W. (2006). How causal knowledge affects classification: A generative
theory of categorization. Journal of Experimental Psychology: Learning, Memory, &
Cognition, 32, 659–683.
Rips, L. J. (2001). Necessity and natural categories. Psychological Bulletin, 127(6), 827–
852.
Rosch, E. H., Mervis, C. B., Gray, W., Johnson, D., & Boyes-Braem, P. (1976). Basic
objects in natural categories. Cognitive Psychology, 8, 382–439.
Shafto, P., & Coley, J. D. (2003). Development of categorization and reasoning in the
natural world: Novices to experts, naive similarity to ecological knowledge. Journal of
Experimental Psychology: Learning, Memory, and Cognition, 29(4), 641–649.
Shipley, E. F. (1993). Categories, hierarchies, and induction. In D. Medin (Ed.), The
psychology of learning and motivation (Vol. 30, pp. 265–301). New York: Academic
Press.
Sloman, S. A. (1993). Feature-based induction. Cognitive Psychology, 25, 231–280.
Sloman, S. A. (1994). When explanations compete: The role of explanatory coherence
on judgments of likelihood. Cognition, 52, 1–21.
Sloman, S. A. (1997). Explanatory coherence and the induction of properties. Thinking
and Reasoning, 3(2), 81–110.
Sloman, S. A., Love, B. C., & Ahn, W. (1998). Feature centrality and conceptual coherence.
Cognitive Science, 22(2), 189–228.
Smith, E. E., Shafir, E., & Osherson, D. (1993). Similarity, plausibility, and judgments of
probability. Cognition, 49, 67–96.
Springer, K., & Keil, F. C. (1989). On the development of biologically specific beliefs:
The case of inheritance. Child Development, 60, 637–648.
Thagard, P. (1999). How scientists explain disease. Princeton, NJ: Princeton University
Press.
Wu, M. L., & Gentner, D. (1998). Structure in category-based induction. Paper presented
at the 20th Annual Conference of the Cognitive Science Society, Madison, WI.
are best understood in terms of acute and chronic changes in the availability
of different kinds of knowledge.
of animals matched the kind of property they were asked to reason about (e.g.,
whales and bears have two-chamber livers, whales and tuna travel in zigzag
trajectories). These results suggest that the property being projected influenced
the kind of knowledge that was recruited to support inductive inferences:
anatomical properties made anatomical knowledge more available, whereas
behavioral properties made behavioral knowledge more available.
Ross and Murphy (1999) have shown similar effects of property on the se-
lective use of conceptual relations to guide inferences in the domain of food.
They established that most participants cross-classified food into two major
knowledge structures – taxonomic, based on shared features or composition,
and script, based on what situations a food is consumed in. Participants in
their study were asked to make biochemical or situational inferences about
triplets of food. Participants were taught that the target food, such as bagels,
had a biochemical property (enzyme) or situational property (eaten at an
initiation ceremony). They were then asked to project the property to one
of two alternatives, a taxonomic alternative such as crackers, or a script al-
ternative such as eggs. Ross and Murphy found that participants made more
taxonomic choices for biochemical properties and more script choices when
considering situational properties. It seems that the nature of the property
increased the availability of relevant knowledge about shared composition or
situational appropriateness of a food, ultimately producing different patterns
of induction for different kinds of properties.
Recent evidence also suggests that, like adults, children’s inductive general-
izations are sensitive to the property in question. Using a similar triad method
to that of Ross and Murphy (1999), Nguyen and Murphy (2003) found that
in the domain of food, seven-year-old children (but not four-year-olds) made
more taxonomic choices when reasoning about a biochemical property and
more script choices when reasoning about a situational property. That is, they
thought that a bagel would be more likely to share an enzyme with crackers
but be more likely to be eaten at a ceremony with eggs. This suggests that the
context provided by a property begins to mediate the differential availability
of knowledge from an early age.
Further evidence of children’s selective use of knowledge based on prop-
erty comes from the domain of biology. Coley and colleagues (Coley, 2005;
Coley, Vitkin, Seaton, & Yopchick, 2005; Vitkin, Coley, & Kane, 2005) asked
school-aged children to consider triads of organisms with a target species,
a taxonomic alternative (from the same superordinate class but ecologically
unrelated), and an ecological alternative (from a different taxonomic class
but related via habitat or predation). Children were taught, for example, that
a banana tree had a property (either a disease or “stuff inside”) and asked
Medin et al. (2003) examine this idea with respect to two broad classes of
phenomena, causal relations and property reinforcement. We concentrate on
the latter. Medin et al. (2003) present several examples where increasing the
salience of specific relations among categories in an inductive argument leads
to violations of normative logic or of the predictions of similarity-based
models of inductive reasoning.
The first of these is non-diversity via property reinforcement, which predicts
that an argument with less diverse premises might be perceived to be stronger
than an argument with more diverse premises if the premise categories of the
more diverse argument reinforce a salient relation not shared by the con-
clusion category. For example, consider the following arguments:

Argument A: Polar bears have Property X. Antelopes have Property X. Therefore, all animals have Property X.
Argument B: Polar bears have Property X. Penguins have Property X. Therefore, all animals have Property X.
From a strictly taxonomic point of view, polar bears (a mammal) and penguins
(a bird) provide better coverage of the conclusion category animal than polar
bears and antelopes, which are both mammals. Thus, a model based only on
taxonomic knowledge must predict Argument B to be stronger. However, the
salient property shared by polar bears and penguins – namely, adaptation to
a cold climate – renders plausible the possibility that Property X is related
to life below zero, and therefore might weaken the inference that all animals
would share the property. Indeed, Medin et al. (2003) find that subjects in
the United States, Belgium, and Australia on average rated arguments like A
stronger than arguments like B.1 This suggests that the salient property shared
by the premise categories in B cancels out the greater coverage they provide
of the conclusion category.
A second related phenomenon discussed by Medin et al. (2003) is conjunc-
tion fallacy via property reinforcement, which predicts that arguments with a
single conclusion category might be perceived to be weaker than arguments
with an additional conclusion category (a violation of normative logic because
a conjunctive statement cannot be more probable than one of the component
statements) if the second conclusion category reinforces a salient relation
shared by all categories. For example, consider the following arguments:
1 Though recent evidence suggests that this effect may not be particularly robust (Heit & Feeney,
2005).
Normative logic requires that “cows have Property X” must be more likely
than “cows and pigs have Property X,” but Medin et al. (2003) found that par-
ticipants reliably rated arguments like D as more likely than arguments like
C. The addition of pigs might serve to increase the availability of the knowl-
edge about farm animals, and therefore strengthens Argument D relative to
Argument C.
Finally, Medin et al. (2003) discuss non-monotonicity via property rein-
forcement. Monotonicity is the idea that all else being equal, adding premise
categories that are proper members of the same superordinate as a conclusion
category should strengthen the argument (see Osherson et al., 1990). Medin
et al. (2003) predict that adding premises might weaken an argument if the
added categories reinforce a relation shared by premise categories but not the
conclusion category. For example, consider the following arguments:
These findings suggest that relations among premise categories may im-
pact the availability of different kinds of knowledge for guiding sponta-
neous inferences. Premise categories that share salient taxonomic relations
render such knowledge available and thereby increase the likelihood of
taxonomic generalizations. Likewise, premise categories that share salient
ecological relations increase the availability and likelihood of ecological
inferences.
were chosen so that one would clearly be stronger via diversity. In addition
to being randomly assigned to the animal or alcohol conditions, participants
were also randomly assigned to evaluate arguments about a chemical com-
ponent or about getting sick. Results showed clear evidence for inductive
selectivity when undergraduates were reasoning about alcohol. Specifically,
participants reasoning about alcohol showed differential use of taxonomic
knowledge as a function of property; these participants were more likely to
make diversity-based inferences about getting sick than about a chemical
component. Moreover, participants reasoning about alcohol also provided
different explanations for their choices as a function of property; they were
more likely to offer causal explanations for inferences about getting sick, but
more likely to offer taxonomic explanations for inferences about a chemical
component. In contrast, there was no evidence of inductive selectivity when
participants were reasoning about animals; neither the relative frequency of
diversity-based choices nor the type of explanations provided for those choices
varied for inferences about a chemical versus getting sick. These results are in
close accord with those of Shafto and Coley (2003) just described, and they
suggest that greater domain-specific experience may increase the potential
availability of multiple conceptual relations and therefore increase inductive
selectivity as a function of property being projected.
2 The authors also propose that category members may be differentially available; for example,
suggesting that robins are more available members of the category bird than turtledoves.
conceptual relation reflects the availability of that knowledge; all else being
equal, more available knowledge requires less effort to access and use. Thus,
both the acute and chronic changes in availability reviewed above reflect –
from the perspective of relevance – acute and chronic changes in the effort
required to access a given set of conceptual relations.
In the spirit of both Barsalou (1982) and Medin et al. (2003), the availability
framework distinguishes two kinds of change: chronic changes in the availability
of different kinds of knowledge, grounded in experience, and acute changes in
availability that result from context. Chronic changes in availability account for why taxonomic
knowledge is less effortful than ecological knowledge for biological novices
and taxonomists but not for fishermen. Acute changes in availability reflect
the fact that context manipulates effort to access different knowledge (cf.
Heit & Bott, 2000). The interaction between chronic and acute changes in
availability determines the degree to which people show inductive selectivity.
Though we believe availability provides a coherent framework uniting
expertise and property differences in induction, we think that the true merit
of thinking about category-based induction in terms of availability will be
in the guidance it provides in moving research forward. In the next section,
we describe some recent work inspired by availability, and derive additional
novel (but yet untested) predictions from this proposal.
Availability in Action
We see two major challenges in the development of an availability-based
framework. The first is to identify what kinds of predictions can be generated
by understanding the relationship between knowledge and reasoning in terms
of availability, and to begin to test those predictions. The second is to explain
how chronic changes in availability arise with experience. In this section,
we will outline some initial studies addressing the first challenge and some
preliminary ideas which address the second.
To be useful, any framework must generate new hypotheses as well as de-
scribe existing results. In a recent set of studies, we investigated availability as a
possible explanation for the lack of inductive selectivity in novice populations
(Shafto, Coley, & Baldwin, 2005). To implicate availability, it is important to
show that having knowledge is not sufficient for inductive selectivity. Previ-
ous research (Shafto & Coley, 2003) suggests that context (novel diseases and
novel properties) did not elicit the selective use of taxonomic and ecological
knowledge in biological novices. One reason novices may not have demon-
strated inductive selectivity was a baseline difference in the availability of tax-
onomic and ecological knowledge. In a series of experiments (Shafto, Coley, &
Baldwin, 2005), we provided support for this claim by investigating the effects
References
Anderson, J. R., & Milson, R. (1989). Human memory: An adaptive perspective. Psycho-
logical Review, 96, 703–719.
Baker, A., & Coley, J. D. (2005). Taxonomic and ecological relations in open-ended induc-
tion. Paper presented at the 27th Annual Conference of the Cognitive Science Society,
Stresa, Italy.
Shafto, P., Kemp, C., Baraff, E., Coley, J. D., & Tenenbaum, J. B. (2005). Context-sensitive
induction. Paper presented at the 27th Annual Conference of the Cognitive Science
Society, Stresa, Italy.
Shafto, P., Kemp, C., Baraff, E., Coley, J. D., & Tenenbaum, J.B. (in prep). Inductive
reasoning with causal knowledge.
Sloman, S. A. (1993). Feature-based induction. Cognitive Psychology, 25, 231–280.
Smith, E. E., Shafir, E., & Osherson, D. N. (1993). Similarity, plausibility, and judgments
of probability. Cognition, 49, 67–96.
Stepanova, O., & Coley, J. D. (2003). Animals and alcohol: The role of experience in
inductive reasoning among college students. Paper presented at the 44th Annual Meeting
of the Psychonomic Society, Vancouver, British Columbia.
Stepanova, O. (2004). Vodka and vermin: Naïve reasoning about animals and alcohol.
Unpublished doctoral dissertation, Northeastern University.
Tversky, A., & Kahneman, D. (1973). Availability: A heuristic for judging frequency and
probability. Cognitive Psychology, 5, 207–232.
Vitkin, A. Z., Coley, J. D., & Hu, R. (2005). Children’s use of relevance in open-ended
induction in the domain of biology. Paper presented at the 27th Annual Conference of
the Cognitive Science Society, Stresa, Italy.
Vitkin, A., Coley, J. D., & Kane, R. (2005). Salience of taxonomic and ecological relations
in children’s biological categorization. Paper presented at the Biennial Meetings of the
Society for Research in Child Development, Atlanta, GA.
In reality, all arguments from experience are founded on the similarity which we
discover among natural objects, and by which we are induced to expect effects similar
to those which we have found to follow from such objects. . . . From causes which
appear similar we expect similar effects.
David Hume, An Enquiry Concerning Human Understanding (1772)
1 See also Goodman (1955), Quine (1960). In the psychology literature, the matter has been
rehearsed by Tversky & Gati (1978), Osherson, Smith, & Shafir (1986), Medin, Goldstone, &
Gentner (1993), and Keil, Smith, Simons, & Levin (1998), among others.
evokes a vaguely physiological context but not much more. Thus, one may
have the intuition that cameras and computers do not require biotin but no
a priori sense of whether bears and wolves do. Nonetheless, the assertion that
bears require biotin, coupled with the similarity of wolves to bears, accords
strength to the foregoing argument. Theories that incorporate similarity as a
basis for induction – such as the Similarity-Coverage model – have been
able to account for a range of reasoning phenomena.4
Similarity-Coverage is unsuited to non-blank predicates like that figuring
in the following arguments:

(1) (a) Fieldmice often carry the parasite Floxum.
        Housecats often carry the parasite Floxum.
2 Linda was born in Tversky & Kahneman (1983). For recent debate about the fallacy she illustrates,
see Kahneman & Tversky (1996), Gigerenzer (1996), and Tentori, Bonini, & Osherson (2004),
along with references cited there.
3 A more cautious formulation would relativize blankness to a particular judge and also to the
objects in play. Degrees of blankness would also be acknowledged, in place of the absolute
concept here invoked. In what follows, we’ll rely on rough formulations (like the present one)
to convey principal ideas.
4 See Osherson, Smith, Wilkie, López, & E. Shafir (1990). An alternative account is offered in
Sloman (1993). Blank predicates first appear in Rips (1975). For examination of Similarity-
Coverage type phenomena in the context of natural category hierarchies, see Coley, Atran, &
Medin (1997) and Medin, Coley, Storms, & Hayes (2003). Developmental perspectives on
Similarity-Coverage are available in López, Gelman, Gutheil, & Smith (1992), Heit & Hahn
(1999), and Lo, Sides, Rozelle, & Osherson (2002). Similarity among biological categories may
be more resistant to contextual shifts than is the case for other categories (Barsalou & Ross,
1986). This would explain their popularity in many discussions of inductive inference.
Many people find (1)a to be stronger than (1)b, no doubt because the relational
property characteristically ingests readily comes to mind (probably evoked by
mention of parasites). In contrast, ingestion is unlikely to be salient when
judging the similarity of fieldmice to housecats, preventing similarity from
predicting strength. The root of the problem is the difference in causal theories
that may occur to the reasoner when diverse sets of properties are evoked in
the two settings.5
For another limitation of Similarity-Coverage, consider the conclusion
that Labradors can bite through wire (of a given thickness and composition)
given the alternative premises (a) Collies can, versus (b) Chihuahuas can. More
people think that (b) provides better evidence for the conclusion than does (a),
despite the fact that Labradors are more similar to Collies than to Chihuahuas
(Smith, Shafir, & Osherson, 1993). This intuition likely derives from the belief
that Chihuahuas are less powerful than Collies, hence (b) is less likely to be
true. The example therefore suggests that a model of induction based solely
on similarity will fail when the probability of predicate-application is not
uniform across the objects in play.
The aim of the chapter is to show how the prior probabilities of state-
ments might be incorporated into a similarity-based model of induction.6
We attempt to achieve this by elaborating a new version of the “Gap Model”
advanced in Smith et al. (1993). We also suggest how probabilities and similar-
ities can be exploited to construct joint distributions over sets of propositional
variables. The first step in our endeavor is to characterize a class of predicates
that are neither blank nor epistemically too rich.
5 The Floxum example is based on López, Atran, Coley, Medin, & Smith (1997). It is exploited in
Lo et al. (2002) to argue that the “diversity principle” does not have the normative status often
assumed by psychologists. For the role of causal theories in commonsense reasoning, see Ahn,
Kalish, Medin, & Gelman (1995) and Lassaline (1996), along with the theoretical discussions in
Ortiz (1993) and Turner (1993).
6 Reservations about the role of similarity in theories of reasoning are advanced in Sloman & Rips
(1998). Overviews of theories of inductive reasoning are available in Heit (2000), Sloman &
Lagnado (2005).
mind when judging the strength of the argument compared to judging the
similarity of the objects figuring in it. To test the stability of an argument,
the mental potentiation of various predicates would need to be measured
during similarity judgment and inference. We do not offer a recipe for such
measurement but observe that stability is not circularly defined in terms that
guarantee the success of any particular model of inference (like ours, below).
Call a predicate “stable” if it gives rise to stable arguments.
For a further distinction, suppose that a is Dell Computer Corporation,
and c is HP/Compaq. If Q is increases sales next year, then Qa will strike many
reasoners as confirmatory of Qc, whereas if Q is increases market share next
year, then Qa will seem disconfirmatory of Qc. The similarity of a and c can be
expected to have different effects in the two cases. To mark the difference, we
qualify a predicate Q as monotonically increasing [respectively decreasing] for
objects O = {o1 · · · on} just in case Prob(Qx : Qy) ≥ Prob(Qx) [respectively,
Prob(Qx : Qy) ≤ Prob(Qx)] for all x, y ∈ O; and call Q monotonic for O if
Q is either monotonically increasing or decreasing for O.
Let us illustrate these ideas. The authors expect that for many people the
following predicates are monotonically increasing, and yield stable arguments
over the class of mammal species7 :
7 For another example, susceptibility to disease is usually perceived to be shared for species of the
same genus (Coley, Atran, & Medin, 1997).
merely be noted that few explicit, testable theories of inductive strength for
non-trivial classes of arguments are presently on offer. The stable predicates
in (2) suggest the richness of the class of elementary arguments. Success in
predicting their strength is thus a challenging endeavor and may point the
way to more general models.
(4) ±Qa / ±Qc    Prob(±Qa)    Prob(±Qc)    Prob(±Qc : ±Qa)    sim(a, c)
8 Claims for asymmetry appear in Tversky (1977) but seem to arise only in unusual circumstances;
see Aguilar & Medin (1999). The assumption sim(o i , o j ) ∈ [0, 1] is substantive; in particular,
dissimilarity might be unbounded. This possibility is ignored in what follows. For a review of
ideas concerning asymmetries in inductive reasoning, see Chapter 3 by Medin & Waxman in
the present volume.
(a) As Prob (Qa) approaches 1, Prob (Qc : Qa, Qb) approaches Prob (Qc :
Qb). Likewise, as Prob (Qb) approaches 1, Prob (Qc : Qa, Qb) ap-
proaches Prob (Qc : Qa).
(b) As Prob (Qa) and Prob (Qb) both approach 1, Prob (Qc : Qa, Qb) ap-
proaches Prob (Qc ).
(c) Other things equal, as Prob (Qa) and Prob (Qb) both decrease,
Prob (Qc : Qa, Qb) increases.
(d) As Prob (Qc ) approaches 1, so does Prob (Qc : Qa, Qb); as Prob (Qc )
approaches zero, so does Prob (Qc : Qa, Qb).
The reader can verify that (5) satisfies the qualitative conditions reviewed
above for Qa / Qc. To illustrate, as sim(a, c) goes to 1, (1 − sim(a, c)) / (1 + sim(a, c))
goes to 0, hence α goes to 0, so Prob(Qc)^α goes to 1 [hence, Prob(Qc : Qa) goes to 1].
For another example, as Prob(Qa) goes to 1, α goes to 1, so Prob(Qc)^α goes
to Prob(Qc), hence Prob(Qc : Qa) goes to Prob(Qc) as desired. Of course,
(5) is not unique with these properties, but it is the simplest formula that
occurs to us [and restricts Prob(Qc : Qa) to the unit interval]. In the next sec-
tion, (5) will be seen to provide a reasonable approximation to Prob(Qc : Qa)
in an experimental context. It can be seen that our proposal is meant for
use with monotonically increasing predicates inasmuch as it guarantees that
Prob(Qc : Qa) ≥ Prob(Qc).
The following formulas are assumed to govern one-premise elementary
arguments with negations. They satisfy commonsense requirements corre-
sponding to those discussed above.
(6) Prob(Qc : ¬Qa) = 1.0 − (1.0 − Prob(Qc))^α, where
        α = ((1 − sim(a, c)) / (1 + sim(a, c)))^Prob(Qa)

(7) Prob(¬Qc : Qa) = 1.0 − Prob(Qc)^α, where
        α = ((1 − sim(a, c)) / (1 + sim(a, c)))^(1 − Prob(Qa))
In words, (10) claims that Prob (Qc : Qa, Qb) is given by the probability of
the dominant argument increased by a fraction of the probability 1 − Prob
(Qc : Qa) that the dominant argument “leaves behind.” The size of this frac-
tion depends on two factors, namely, the similarity between a and b (to avoid
redundancy), and the impact of the nondominant premise on the credibility
of Qc .
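The chapter's exact two-premise formula (10) is not reproduced above, so the sketch below should be read only as one hypothetical way to realize the verbal description just given. In particular, taking the "fraction" to be (1 − sim(a, b)) times the single-premise support lent by the nondominant premise is our assumption, not the authors' equation.

def prob_qc_given_qa_qb(p_c_given_a: float, p_c_given_b: float, sim_ab: float) -> float:
    """Hypothetical reading of the verbal gloss on (10): start from the dominant
    single-premise value and add back a fraction of the probability it leaves behind.
    The fraction shrinks as a and b become similar (redundancy) and grows with the
    support the nondominant premise lends to Qc. Illustration only."""
    dominant = max(p_c_given_a, p_c_given_b)
    nondominant = min(p_c_given_a, p_c_given_b)
    fraction = (1.0 - sim_ab) * nondominant   # assumed form of the "fraction"
    return dominant + (1.0 - dominant) * fraction

Note that any rule of this shape already respects the constraints discussed next: the two-premise value never falls below either single-premise value.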
The constraints outlined earlier are satisfied by (10). For example, the
formula implies that Prob (Qc : Qa, Qb) ≈ 1 if sim(a, c ) ≈ 1. Note that
our proposal ensures that strength increases with extra premises, that is,
Prob (Qc : Qa, Qb) ≥ Prob (Qc : Qa), Prob (Qc : Qb). This feature is plausi-
ble given the restriction to monotonically increasing predicates.
We now list the other two-premise cases. They are predictable from (10)
by switching the direction of similarity and “the probability left behind” as
appropriate.
(11) Case 2. Arguments of form ¬Qa, Qb / Qc with ¬Qa dominant.
Arguments of form Qa, ¬Qb / Qc with ¬Qb dominant are treated similarly.
graduates [of a given institution] earned an average salary of more than $50,000
a year in their first job after graduation.
graduates [of a given institution] earned an average salary of less than $50,000 a
year in their first job after graduation.
9 The predicate was slightly (but inessentially) different for these eighteen participants. Note that
we distinguish the order of two conditioning events; otherwise, there would be only thirty prob-
abilities of form Prob (Qc : Qa, Qb) based on five objects. No student received two arguments
differing just on premise order. The order in which information is presented is an important
variable in many reasoning contexts (Johnson-Laird, 1983) but there was little impact in the
present study.
3.2 Results
Overall, the inputs and outputs for 200 different arguments were evaluated,
each by nine, eighteen, twenty-three, twenty-four, or forty-one undergradu-
ates. The analysis that follows is based on the mean estimates for each input
and output.
We computed the Pearson correlation between the values obtained for
Prob (Qc : ± Qa) or Prob (Qc : ± Qa, ±Qb) with the values predicted by
Gap2. The result is r = 0.94. The regression line has slope = 0.874, and in-
tercept = 0.094. See Figure 6.1. To gauge the role of the prior probability
Prob (Qc ) in Gap2’s performance, we computed the correlation between the
observed probabilities Prob (Qc : ± Qa) and Prob (Qc : ± Qa, ±Qb) versus
the predictions of Gap2 with Prob (Qc ) subtracted out. The correlation re-
mains substantial at r = 0.88 (slope = 0.811, intercept = 0.079).
Twenty arguments of form Qa / Qc were evaluated using objects (14)a–e,
and another twelve using (14)e–h. In thirty-one of these thirty-two cases, the
average probabilities conform to the monotonicity principle Prob (Qc : Qa) ≥
Prob (Qc ) thereby agreeing with Gap2. The situation is different in the two-
premise case. An argument of form Qa, Qb / Qc exhibits monotonicity if the
average responses yield the following:
(15) Prob (Qc : Qa, Qb) ≥ max{Prob (Qc : Qa), Prob (Qc : Qb)}.
figure 6.1. Predicted versus observed probabilities for the 200 arguments. [Scatterplot; x-axis: Predicted, y-axis: Observed, both 0.00–1.00.]
about backing off Gap2’s monotonicity in this way. Averaging depicts the
reasoner as strangely insensitive to the accumulation of evidence, which be-
comes an implausible hypothesis for larger premise sets. Also, the prevalence
of nonmonotonic responding may be sensitive to procedural details. A pre-
liminary study that we have conducted yields scant violation of monotonicity
when premises are presented sequentially rather than two at a time.
The sixteen one-premise arguments were equally divided among the forms
Qa / Qc, Qa / ¬Qc, ¬Qa / Qc, and ¬Qa / ¬Qc. The sixteen two-premise ar-
guments were equally divided among Qa, Qb / Qc and Qa, ¬Qb / Qc. Objects
figure 6.2. Predicted versus observed probabilities for thirty-two arguments. [Scatterplot; x-axis: Predicted, y-axis: Observed.]
(i.e., the six mammals) were assigned to the thirty-two arguments so as to bal-
ance their frequency of occurrence. Twenty college students at Northwestern
University rated the similarity of each of the fifteen pairs of objects drawn
from the list above. A separate group of twenty students estimated the condi-
tional probabilities of the thirty-two arguments plus the probabilities of the
six statements Qa. Data were collected via computer interface.
Since the similarity and probability estimates were collected from separate
groups of subjects there could be no contamination of one kind of judgment
by the other. The data from each group were averaged. The fifteen similarity
averages plus six averages for unconditional probabilities were then used
to predict the thirty-two average conditional probabilities via Gap2. The
correlation between predicted and observed values was .89. The results are
plotted in Figure 6.2.
A more thorough test of Gap2 (and comparison to rivals) requires experi-
mentation with a range of predicates and objects. The arguments described in
here are homogeneous in the sense that the same predicate figures in each. We
expect Gap2 to apply equally well to heterogeneous collections of elementary
arguments (involving a multiplicity of predicates).
It is easy to envision extensions of Gap2 beyond elementary arguments [as
defined in (3)]. For example, the conditional probability associated with a
general-conclusion argument like
according to Gap2, where X ranges over the rodent species that come to the
reasoner’s mind.10 Another extension is to arguments in which neither the
object nor predicate match between premise and conclusion. The goal would
be to explain the strength of arguments like the following:
Both these extensions still require that the argument’s predicate(s) be stable
and monotonic for the single-premise case. Escaping this limitation requires
understanding how the same predicate can evoke different associations when
assessing strength compared to similarity. Specific causal information may
often be responsible for such a state of affairs. We acknowledge not (presently)
10 A “coverage” variable could also be added to the theory, in the style of the Similarity-Coverage
Model. Coverage for specific arguments is motivated by asymmetry phenomena such as the
greater strength of ‘Qcats / Qbats’ compared to ‘Qbats / Qcats’ (see Carey, 1985; Osherson et al.,
1990). Gap2 does not predict this phenomenon for any predicate Q such that Prob(Qcats) =
Prob (Qbats) (e.g., when Q is blank). Asymmetry is known to be a weak effect, however (see
Hampton & Cannon, 2004). If it is robust enough to merit modeling, this could also be achieved
by positing asymmetry in sim.
Closing the six statements under boolean operators yields logically complex
statements like the following:
(17) (a) Eagles and Cardinals have muscle-to-fat ratio at least 10-to-1.11
(b) Ducks but not Geese have muscle-to-fat ratio at least 10-to-1.
(c) Either Hawks or Parakeets (or both) have muscle-to-fat ratio at least
10-to-1.
11 A more literal rendition of closure under boolean conjunction would be “Eagles have muscle-
to-fat ratio at least 10-to-1 and Cardinals have muscle-to-fat ratio at least 10-to-1.” We assume
throughout that the official logical structure of statements is clear from the abbreviations that
make them more colloquial. Note also that negation of predicates may often be expressed
without “not.” Thus, “have muscle-to-fat ratio below 10-to-1” serves as the negation of “have
muscle-to-fat ratio at least 10-to-1.”
(b) Eagles have muscle-to-fat ratio at least 10-to-1 assuming that Geese
don’t.
(c) Ducks and Geese have muscle-to-fat ratio at least 10-to-1 assuming
that Hawks do.
Judgments about chance begin to fade in the face of increasing logical com-
plexity, but many people have a rough sense of probability for hundreds of
complex and conditional structures like those in (17) and (18). We claim
(19) Estimates of the chances of complex and conditional events defined over
Q(o 1 ) · · ·Q(o n ) are often mentally generated from no more information
than:
(a) probability estimates for each of Q(o 1 ) · · ·Q(o n ); and
(b) the pairwise similarity of the n objects o i to each other.
Hawks and Eagles and Parakeets but neither Cardinals nor Geese nor Ducks have
muscle-to-fat ratio at least 10-to-1.
Prob is extended again to pairs (ϕ, ψ) of formulas such that Prob(ψ) > 0; specifically:

        Prob(ϕ, ψ) = Prob(ϕ ∧ ψ) / Prob(ψ).
12 See the references in note 2. Incoherence can be “fixed” via a method described in Batsell,
Brenner, Osherson, Vardi, & Tsvachidis (2002).
After the provisional chances are generated via f , the goal of the algorithm
is to build a genuine distribution that comes as close as possible to respecting
them, subject to the constraints imposed by the rated probability Ch(Q(o i ))
of each variable. We therefore solve the following quadratic programming
problem:
(20) Find a probability distribution Prob such that
        Σ {(f(α) − Prob(α))² : α is a complete conjunction}
     is minimized
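To show what solving a quadratic program of this kind involves, here is a small numerical sketch using SciPy. The toy values for the provisional chances f(α) and the rated probabilities Ch(Q(o_i)) are made up, and the chapter's own procedure for generating f from similarities is not reproduced here.

# Numerical sketch of a quadratic program like (20): fit a distribution over complete
# conjunctions to provisional chances, subject to the rated marginals Ch(Q(o_i)).
import itertools
import numpy as np
from scipy.optimize import minimize

n = 3                                                        # variables Q(o_1)..Q(o_n)
conjunctions = list(itertools.product([0, 1], repeat=n))     # complete conjunctions
ch = np.array([0.7, 0.4, 0.2])                               # rated chances (toy values)
rng = np.random.default_rng(0)
f = rng.random(len(conjunctions))                            # provisional chances (toy values)

def objective(p):
    # Sum of squared deviations from the provisional chances, as in (20).
    return float(np.sum((f - p) ** 2))

constraints = [{"type": "eq", "fun": lambda p: np.sum(p) - 1.0}]   # must be a distribution
for i in range(n):
    mask = np.array([a[i] for a in conjunctions], dtype=float)
    # Marginal constraint: the implied Prob(Q(o_i)) must equal the rated Ch(Q(o_i)).
    constraints.append({"type": "eq", "fun": lambda p, m=mask, c=ch[i]: float(m @ p - c)})

result = minimize(objective,
                  x0=np.full(len(conjunctions), 1.0 / len(conjunctions)),
                  bounds=[(0.0, 1.0)] * len(conjunctions),
                  constraints=constraints, method="SLSQP")
prob = result.x        # fitted probability of each complete conjunction
print(prob, prob.sum())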
which the agent has particular information about the probabilities of various
events and conditional events.13
13 More generally, the constraints may be weak inequalities rather than equalities.
14 As sim(o i , o j ) approaches unity, Ch(c) should also approach max{Ch(Q(o i )), Ch(Q(o j ))}. Prior
to reaching the limit, however, the maximum might exceed Ch(c), which is probabilistically
incoherent. No such incoherence is introduced by the minimum.
Case 4: c = ¬Q(o i ) ∧ Q(o j ). This case is parallel to the preceding one. Hence
15 The raw data for the multidimensional scaling algorithm were rated semantic similarities for
each pair of categories. Distance in the output reflects dissimilarity. (Included in the ratings was
semantic similarity to the superordinate concept bird.)
Statements. Each of the four combinations available from (21) and (22) gives
rise to a set of variables Q(o 1 ) · · ·Q(o 6 ). Thirty-six complex statements were
generated from the variables. The logical forms of the complex statements
were the same across the four sets of stimuli; they are shown in Table 6.2. To
illustrate, relative to (21)a and (22)a, the complex statement of form Q(o 6 ) ∧
¬Q(o 5 ) is (17)b, above. For the same set of stimuli, the form Q(o 2 )|¬Q(o 5 ) is
(18)b. In this way, each of the four stimulus-sets was associated with forty-two
statements (six variables plus thirty-six complex statements).
5.5 Results
The average estimates for the variables are shown in Table 6.3. The algorithm
(20) was used to derive a separate distribution Prob for each participant.
Structure    #    Formulas
p ∧ q        7    Q(o1) ∧ Q(o2); Q(o5) ∧ Q(o6); Q(o1) ∧ Q(o5); Q(o1) ∧ Q(o6); Q(o2) ∧ Q(o5); Q(o2) ∧ Q(o6); Q(o3) ∧ Q(o4)
p ∧ ¬q       7    Q(o1) ∧ ¬Q(o3); Q(o1) ∧ ¬Q(o4); Q(o2) ∧ ¬Q(o5); Q(o2) ∧ ¬Q(o6); Q(o6) ∧ ¬Q(o1); Q(o1) ∧ ¬Q(o2); Q(o6) ∧ ¬Q(o5)
¬p ∧ ¬q      7    ¬Q(o4) ∧ ¬Q(o3); ¬Q(o6) ∧ ¬Q(o3); ¬Q(o6) ∧ ¬Q(o5); ¬Q(o2) ∧ ¬Q(o1); ¬Q(o6) ∧ ¬Q(o1); ¬Q(o2) ∧ ¬Q(o5); ¬Q(o3) ∧ ¬Q(o5)
p ∨ q        3    Q(o2) ∨ Q(o3); Q(o4) ∨ Q(o5); Q(o6) ∨ Q(o4)
p : q        5    Q(o1) : Q(o2); Q(o2) : Q(o3); Q(o3) : Q(o2); Q(o4) : Q(o1); Q(o5) : Q(o2)
p : ¬q       4    Q(o1) : ¬Q(o5); Q(o2) : ¬Q(o6); Q(o3) : ¬Q(o4); Q(o6) : ¬Q(o5)
¬p : q       3    ¬Q(o4) : Q(o3); ¬Q(o5) : Q(o1); ¬Q(o6) : Q(o3)
Note: Indices are relative to the orderings in (21)a, b. The symbols ∧ and ∨ denote
conjunction and disjunction, respectively; ¬ denotes negation. Conditional probabilities
are indicated with a colon (:).
16 Prob perfectly “predicts” the probabilities assigned to Q(o 1 ) · · ·Q(o 6 ) since the quadratic pro-
gramming problem (20) takes the latter values as constraints on the solution.
(23) (a) Prob(ϕ) + Prob(ψ) − 1 − .05 ≤ Prob(ϕ ∧ ψ) ≤ min{Prob(ϕ), Prob(ψ)} + .05
     (b) max{Prob(ϕ), Prob(ψ)} − .05 ≤ Prob(ϕ ∨ ψ) ≤ Prob(ϕ) + Prob(ψ) + .05
     (c) Prob(ϕ ∧ ψ) / Prob(ψ) − .05 ≤ Prob(ϕ : ψ) ≤ Prob(ϕ ∧ ψ) / Prob(ψ) + .05.
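For readers who want to apply the same checks to their own data, a minimal sketch of the three tolerance tests in (23) might look as follows; the function names are ours.

# Small helpers expressing the coherence checks in (23), with the same .05 tolerance.
# Inputs are a reasoner's raw estimates; True means the estimate passes the check.
TOL = 0.05

def conjunction_ok(p_phi, p_psi, p_and):
    # (23)a: P(phi) + P(psi) - 1 - tol <= P(phi and psi) <= min(P(phi), P(psi)) + tol
    return p_phi + p_psi - 1.0 - TOL <= p_and <= min(p_phi, p_psi) + TOL

def disjunction_ok(p_phi, p_psi, p_or):
    # (23)b: max(P(phi), P(psi)) - tol <= P(phi or psi) <= P(phi) + P(psi) + tol
    return max(p_phi, p_psi) - TOL <= p_or <= p_phi + p_psi + TOL

def conditional_ok(p_and, p_psi, p_cond):
    # (23)c: P(phi and psi)/P(psi) - tol <= P(phi : psi) <= P(phi and psi)/P(psi) + tol
    ratio = p_and / p_psi
    return ratio - TOL <= p_cond <= ratio + TOL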
All twenty-four complex, absolute statements (see Table 6.2) are either
conjunctions or disjunctions, hence accountable to one of (23)a,b. Only four
of the twelve conditional statements could be tested against (23)c since in
eight cases the required conjunction ϕ ∧ ψ does not figure among the absolute
statements. For the averaged data, there are few violations of 23a,b – 6, 0, 1, and
1, respectively, for the four stimulus sets. In contrast, (23)c was violated all four
times in each stimulus set (averaged data), confirming the well documented
incomprehension of conditional probability by college undergraduates (see,
e.g., Dawes, Mirels, Gold, & Donahue, 1993). Incoherence was greater when
tabulated for each subject individually. Participants averaged 10.3 violations
of 23a,b out of 24 possible, and 3.8 violations of (23)c out of 4 possible.
Our results support Thesis (19) inasmuch as the only inputs to the algo-
rithm are similarities and the chances of variables. The algorithm’s predictions
are not perfect, however, and suggest that revised methods might prove more
successful. In particular, accuracy may improve when similarities are elicited
from respondents to the probability questions, instead of derived globally
from an ancient text (Rips et al., 1973).
figure 6.3. Predicted versus observed probabilities for the four sets of stimuli. [Four scatterplots, one per stimulus set, including panels for stimuli (21)b, (22)a and (21)b, (22)b; x-axes: Predicted estimates, y-axes: Observed estimates.]
18 A few times it was necessary to draw a new sample because one of the variables was not implied
by any of the chosen complete conjunctions.
is one quarter the size of the original – was then used to predict the subjects’
average estimates of chance. The last column in Table 6.4 shows the mean of
the six correlations produced for each set of stimuli. Comparing these values
to the second column reveals some loss of predictive accuracy; substantial
correlations nonetheless remain. In a sophisticated implementation of our
algorithm, loss of accuracy could be partially remedied by applying quadratic
programming to several samples of complete conjunctions and averaging the
results.
Both Gap2 and QP f rely on similarity to predict the probabilities assigned
to complex statements. We hope that the performance of the models demon-
strates that similarity can play an explanatory role in theories of induction
provided that attention is limited to stable predicates. Extension beyond the
confines of stability is not trivial but might be facilitated by comparison to
the simpler case.
The importance of similarity to estimates of probability was already ar-
ticulated in the 17th and 18th centuries (see Cohen, 1980; Coleman, 2001).
Bringing quantitative precision to this idea would be a significant achieve-
ment of modern psychology.
Thanks to Robert Rehder for careful reading and helpful suggestions. Re-
search supported by NSF grants 9978135 and 9983260. Contact information:
[email protected], [email protected], [email protected].
References
Aguilar, C. M., and D. L. Medin (1999). “Asymmetries of comparison,” Psychonomic
Bulletin & Review, 6, 328–337.
Ahn, W., C. W. Kalish, D. L. Medin, & S. A. Gelman (1995). “The role of covariation
versus mechanism information in causal attribution,” Cognition, 54, 299–352.
Barsalou, L. W., and B. H. Ross (1986). “The roles of automatic and strategic process-
ing in sensitivity to superordinate and property frequency,” Journal of Experimental
Psychology: Learning, Memory, & Cognition, 12(1), 116–134.
Batsell, R., L. Brenner, D. Osherson, M. Vardi, and S. Tsvachidis (2002). “Eliminating in-
coherence from subjective estimates of chance,” Proceedings of the Ninth International
Workshop on Knowledge Representation. Morgan Kaufmann.
López, A., S. Atran, J. Coley, D. Medin, and E. Smith (1997). “The tree of life: Uni-
versal and cultural features of folkbiological taxonomies and inductions,” Cognitive
Psychology, 32, 251–295.
López, A., S. A. Gelman, G. Gutheil, and E. E. Smith (1992). “The development of
category-based induction,” Child Development, 63, 1070–1090.
Luenberger, D. G. (1984). Linear and nonlinear programming (2nd edition), Addison-
Wesley, Reading MA.
Medin, D. L., J. D. Coley, G. Storms, and B. K. Hayes (2003). “A relevance theory of
induction,” Psychonomic Bulletin & Review 10(3), 517–532.
Medin, D. L., R. L. Goldstone, and D. Gentner (1993). “Respects for similarity,” Psycho-
logical Review, 100, 254–278.
Medin, D. L., and M. M. Schaffer (1978). “Context model of classification learning,”
Psychological Review, 85, 207–238.
Neapolitan, R. (1990). Probabilistic reasoning in expert systems: Theory and algorithms.
John Wiley & Sons, New York NY.
Nilsson, N. (1986). “Probabilistic logic,” Artificial Intelligence, 28(1), 71–87.
Ortiz, C. L. (1999). “A commonsense language for reasoning about causation and rational
action,” Artificial Intelligence, 111, 73–130.
Osherson, D., E. E. Smith, and E. Shafir (1986). “Some origins of belief,” Cognition, 24,
197–224.
Osherson, D., E. E. Smith, O. Wilkie, A. López, and E. Shafir (1990). “Category-based
induction,” Psychological Review, 97(2), 185–200.
Pearl, J. (1988). Probabilistic reasoning in intelligent systems. Morgan Kaufmann, San
Mateo, CA.
Quine, W. V. O. (1960). Word and Object. MIT Press, Cambridge MA.
Rips, L. (1975). “Inductive judgments about natural categories,” Journal of Verbal Learn-
ing and Verbal Behavior, 14, 665–681.
Rips, L., E. Shoben, and E. Smith (1973). “Semantic distance and the verification of
semantic relations,” Journal of Verbal Learning and Verbal Behavior, 12, 1–20.
Russell, S. J., and P. Norvig (2003). Artificial intelligence: A modern approach (2nd edition).
Prentice Hall, Upper Saddle River NJ.
Shafir, E., E. Smith, and D. Osherson (1990). “Typicality and reasoning fallacies,” Memory
and Cognition, 18(3), 229–239.
Skyrms, B. (2000). Choice & chance: An introduction to inductive logic, Wadsworth,
Belmont CA.
Sloman, S. A. (1993). “Feature based induction,” Cognitive Psychology, 25, 231–280.
Sloman, S. A., and D. Lagnado (2005). “The problem of induction,” in Cambridge
handbook of thinking & reasoning, ed. by K. Holyoak and R. Morrison. Cambridge
University Press, Cambridge.
Sloman, S. A., and L. J. Rips (1998). “Similarity as an explanatory construct,” Cognition,
65, 87–101.
Smith, E. E., E. Shafir, and D. Osherson (1993). “Similarity, plausibility, and judgments
of probability,” Cognition, 49, 67–96.
Tentori, K., N. Bonini, and D. Osherson (2004). “The conjunction fallacy: A misunder-
standing about conjunction?” Cognitive Science, 28(3), 467–477.
Tentori, K., V. Crupi, N. Bonini, and D. Osherson (2007). “Comparison of confirmation
measures,” Cognition, 103, 107–119.
Philosophers since Hume have struggled with the logical problem of induc-
tion, but children solve an even more difficult task – the practical problem
of induction. Children somehow manage to learn concepts, categories, and
word meanings, and all on the basis of a set of examples that seems hopelessly
inadequate. The practical problem of induction does not disappear with ado-
lescence: adults face it every day whenever they make any attempt to predict an
uncertain outcome. Inductive inference is a fundamental part of everyday life,
and for cognitive scientists, a fundamental phenomenon of human learning
and reasoning in need of computational explanation.
There are at least two important kinds of questions that we can ask about
human inductive capacities. First, what is the knowledge on which a given
instance of induction is based? Second, how does that knowledge support gen-
eralization beyond the specific data observed: how do we judge the strength
of an inductive argument from a given set of premises to new cases, or infer
which new entities fall under a concept given a set of examples? We provide
a computational approach to answering these questions. Experimental psy-
chologists have studied both the process of induction and the nature of prior
knowledge representations in depth, but previous computational models of
induction have tended to emphasize process to the exclusion of knowledge
representation. The approach we describe here attempts to redress this im-
balance by showing how domain-specific prior knowledge can be formalized
as a crucial ingredient in a domain-general framework for rational statistical
inference.
The value of prior knowledge has been attested by both psychologists and
machine learning theorists, but with somewhat different emphases. Formal
analyses in machine learning show that meaningful generalization is not
possible unless a learner begins with some sort of inductive bias: some set of
constraints on the space of hypotheses that will be considered (Mitchell, 1997).
However, the best known statistical machine-learning algorithms adopt rela-
tively weak inductive biases and thus require much more data for successful
generalization than humans do: tens or hundreds of positive and negative
examples, in contrast to the human ability to generalize from just one or
a few positive examples. These machine algorithms lack ways to represent and
exploit the rich forms of prior knowledge that guide people’s inductive in-
ferences and that have been the focus of much attention in cognitive and
developmental psychology under the name of “intuitive theories” (Murphy &
Medin, 1985). Murphy (1993) characterizes an intuitive theory as “a set of
causal relations that collectively generate or explain the phenomena in a do-
main.” We think of a theory more generally as any system of abstract principles
that generates hypotheses for inductive inference in a domain, such as hy-
potheses about the meanings of new concepts, the conditions for new rules, or
the extensions of new properties in that domain. Carey (1985), Wellman and
Gelman (1992), and Gopnik and Meltzoff (1997) emphasize the central role
of intuitive theories in cognitive development, both as sources of constraint
on children’s inductive reasoning and as the locus of deep conceptual change.
Only recently have psychologists begun to consider seriously the roles that
these intuitive theories might play in formal models of inductive inference
(Gopnik & Schulz, 2004; Tenenbaum, Griffiths, & Kemp, 2006; Tenenbaum,
Griffiths, & Niyogi, in press). Our goal here is to show how intuitive theo-
ries for natural domains such as folk biology can, when suitably formalized,
provide the foundation for building powerful statistical models of human
inductive reasoning.
Any familiar thing can be thought about in a multitude of ways, and
different kinds of prior knowledge will be relevant to making inferences
about different aspects of an entity. This flexibility poses a challenge for
any computational account of inductive reasoning. For instance, a cat is a
creature that climbs trees, eats mice, has whiskers, belongs to the category
of felines, and was revered by the ancient Egyptians – and all of these facts
could potentially be relevant to an inductive judgment. If we learn that cats
suffer from a recently discovered disease, we might think that mice also have
the disease; perhaps the cats picked up the disease from something they ate.
Yet if we learn that cats carry a recently discovered gene, lions and leopards
seem more likely to carry the gene than mice. Psychologists have confirmed
experimentally that inductive generalizations vary in such ways, depending
on the property involved. Our computational models will account for these
phenomena by positing that people can draw on different prior knowledge
Our work goes beyond previous formal models of induction, which either
do not address the rational statistical basis of people’s inferences or find
it difficult to capture the effects of different kinds of knowledge in different
inductive contexts, or both. In one representative and often-cited example, the
similarity-coverage model of Osherson, Smith, and colleagues, the domain-
specific knowledge that drives generalization is represented by a similarity
metric (Osherson et al., 1990). As we will see below, this similarity metric
has to be defined in a particular way in order to match people’s inductive
judgments. That definition appears rather arbitrary from a statistical point of
view, and arbitrarily different from classic similarity-based models of other
cognitive tasks such as categorization or memory retrieval (Nosofsky, 1986;
Hintzman et al., 1978). Also, the notion of similarity is typically context
independent, which appears at odds with the context-dependent nature of
human inductive reasoning. Even if we allow some kind of context-specific
notion of similarity, a similarity metric seems too limited a representation
to carry the richly structured knowledge that is needed in some contexts, or
even simple features of some reasoning tasks such as the strong asymmetry of
causal relations. In contrast, the knowledge that drives generalization in our
theory-based Bayesian framework can be as complex and as structured as a
given context demands.
The plan of this chapter is as follows. Section 2 provides a brief review
of the specific inductive tasks and phenomena we attempt to account for,
and Section 3 briefly describes some previous models that have attempted
to cover the same ground. Section 4 introduces our general theory-based
Bayesian framework for modeling inductive reasoning and describes two spe-
cific instantiations of it that can be used to model inductive reasoning in two
important natural settings. Section 5 compares our models and several alter-
natives in terms of their ability to account for people’s inductive judgments on
a range of tasks. Section 6 discusses the relation between our work and some
recent findings that have been taken to be problematic for Bayesian models
of induction. Section 7 concludes and offers a preview of ongoing and future
research.
This section reviews the basic property induction task and introduces the core
phenomena that our models will attempt to explain. Following a long tradition
(Rips, 1975; Carey, 1985; Osherson et al., 1990; Sloman, 1993; Heit, 1998), we
will focus on inductive arguments about the properties of natural categories,
in particular biological species categories. The premises of each argument state
that one or more specific species have some property, and the conclusion (to
be evaluated) asserts that the same property applies to either another specific
species or a more general category (such as all mammals). These two kinds of
arguments are called specific and general arguments, respectively, depending
only on the status of the conclusion category.
We use the formula P1, . . . , Pn −prop→ C to represent an n-premise argument
where Pi is the ith premise, C is the conclusion, and prop indicates the
property used. We will often abbreviate references to these components of an
argument. For example, the argument

Gorillas have T4 hormones
Squirrels have T4 hormones
All mammals have T4 hormones

might be represented as gorillas, squirrels −T4→ mammals.
The most systematic studies of property induction have used so-called
“blank properties.” For arguments involving animal species, these are proper-
ties that are recognized as biological but about which little else is known – for
example, anatomical or physiological properties such as “has T4 hormones”
or “has sesamoid bones.” As these properties are hardly “blank” – it is impor-
tant that people recognize them as deriving from an underlying and essential
biological cause – we will instead refer to them as “generic biological proper-
ties.”
For this class of generic properties, many qualitative reasoning phenomena
have been described: Osherson et al. (1990) identify thirteen and Sloman
(1993) adds several others. Here we mention just three. Premise-conclusion
similarity is the effect that argument strength increases as the premises become
more similar to the conclusion: for example, horses −T4→ dolphins is weaker than
seals −T4→ dolphins. For general arguments, typicality is the effect that argument
strength increases as the premises become more typical of the conclusion category.
For example, seals −T4→ mammals is weaker than horses −T4→ mammals, since seals
are less typical mammals than horses. Finally, diversity is the effect that argument
strength increases as the diversity of the premises increases. For example,
horses, cows, rhinos −T4→ mammals is weaker than horses, seals, squirrels −T4→ mammals.
Explaining inductive behavior with generic biological properties is a chal-
lenging problem. Even if we find some way of accounting for all the phenom-
ena individually, it is necessary to find some way to compare their relative
weights. Which is better: an argument that is strong according to the typicality
The tradition of modeling property induction extends at least as far back as
the work of Rips (1975). Here we summarize a few of the more prominent
mathematical models that have been developed in the intervening thirty years.
worse than MaxSim in judging the strength of general arguments (see Section
5). Since Osherson et al. (1990) do not explain why different measures of
setwise similarity are needed in these different tasks, or why SumSim per-
forms so much worse than MaxSim for inductive reasoning, the SCM is less
principled than we might like.
1 Note that a feature-based version of the SCM is achieved if we define the similarity of two objects
as some function of their feature vectors. Section 5 assesses the performance of this model.
that can capture two important and very different kinds of domain knowledge
and that can strongly predict people’s judgments in appropriately different
inductive contexts.
p(y has Q | X) = Σ_{h∈H} p(y has Q, h | X)                              (1)
               = Σ_{h∈H} p(y has Q | h, X) p(h | X).                    (2)
Now p(y has Q | h, X) equals one if y ∈ h and zero otherwise (independent
of X). Thus

p(y has Q | X) = Σ_{h∈H : y∈h} p(h | X)                                 (3)
               = Σ_{h∈H : y∈h} p(X | h) p(h) / p(X)                     (4)

Assuming that p(X | h) is constant across all hypotheses h consistent with X
(and zero otherwise), this reduces to

p(y has Q | X) = Σ_{h∈H : y∈h, h consistent with X} p(h)  /  Σ_{h∈H : h consistent with X} p(h)     (5)
2 More complex sampling models could be appropriate in other circumstances, and are discussed
in Tenenbaum & Griffiths (2001) and Kemp & Tenenbaum (2003). For instance, an assumption
that examples are randomly drawn from the true extension of Q might be particularly important
when learning word meanings or concepts from ostensive examples (Tenenbaum & Xu, 2000;
Xu & Tenenbaum, in press).
could be formalized as

p(Y has Q | X) = Σ_{h∈H : Y⊂h, h consistent with X} p(h)  /  Σ_{h∈H : h consistent with X} p(h)     (6)
Note that a Bayesian approach needs no special purpose rules for deal-
ing with negative evidence or arguments with multiple premises. Once the
prior distribution p(h) and the likelihood p(X|h) have been specified, com-
puting the strength of a given argument involves a mechanical application
of the norms of rational inference. Since we have assumed a simple and
domain-general form for the likelihood above, our remaining task is to spec-
ify appropriate domain-specific prior probability distributions.
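As a concrete illustration of (5) and (6), the following minimal Python sketch computes argument strength from a hand-specified prior over candidate extensions of Q. The toy hypotheses and their prior weights are invented for illustration and simply stand in for a theory-generated prior.

# Minimal sketch of the generalization rules in (5) and (6): hypotheses are candidate
# extensions of Q, the prior p(h) comes from a domain theory (here a toy dictionary),
# and the observed examples X rule out inconsistent hypotheses.
from typing import Dict, FrozenSet, Set

def p_has_q(y: str, X: Set[str], prior: Dict[FrozenSet[str], float]) -> float:
    """Probability that y has Q, given that every category in X has Q (eq. 5)."""
    consistent = {h: p for h, p in prior.items() if X <= h}   # h "consistent with X"
    z = sum(consistent.values())
    return sum(p for h, p in consistent.items() if y in h) / z

def p_all_have_q(Y: Set[str], X: Set[str], prior: Dict[FrozenSet[str], float]) -> float:
    """Generalization to a conclusion category Y, e.g. all mammals (eq. 6)."""
    consistent = {h: p for h, p in prior.items() if X <= h}
    z = sum(consistent.values())
    return sum(p for h, p in consistent.items() if Y <= h) / z

# Toy prior over extensions of Q (made-up weights standing in for a theory-based prior):
prior = {
    frozenset({"dolphin", "seal"}): 0.4,
    frozenset({"horse", "cow"}): 0.3,
    frozenset({"dolphin", "seal", "horse", "cow"}): 0.2,
    frozenset({"dolphin"}): 0.1,
}
print(p_has_q("seal", {"dolphin"}, prior))             # generalization from dolphin to seal
print(p_all_have_q({"horse", "cow"}, {"dolphin"}, prior))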
4.2.1 A Theory For Generic Biological Properties. The prior distribution for
default biological reasoning is based on two core assumptions: the taxonomic
principle and the mutation principle. The taxonomic principle asserts that
species belong to groups in a nested hierarchy, and more precisely, that the
taxonomic relations among species can be represented by locating each species
at some leaf node of a rooted tree structure. Tree-structured taxonomies
of species appear to be universal across cultures (Atran, 1998), and they
also capture an important sense in which species are actually related in the
world: genetic relations due to the branching process of evolution. Outside of
intuitive biology, tree-structured taxonomies play a central role in organizing
knowledge about many systems of natural-kind and artifact categories (Rosch,
1978), as well as the meanings of words that label these categories (Markman,
1989; Tenenbaum & Xu, 2000).
The structures of people’s intuitive taxonomies are liable to deviate from
scientific phylogenies in non-negligible ways, since people’s theories are based
on very different kinds of observations and targeted towards predicting dif-
ferent kinds of properties. Hence we need some source of constraint besides
figure 7.1. (a) A folk taxonomy of mammal species (elephant, rhino, horse, cow, chimp, gorilla, mouse, dolphin, seal, squirrel). (b–e) Examples of mutation histories.
Multiple independent occurrences of the same property will be rare (e.g., the
mutation history in Figure 7.1b is more likely than in 7.1d, which is more
likely than in 7.1e). Hence the prior favors simpler hypotheses corresponding
to a single taxonomic cluster, such as {dolphins, seals}, over more complex
hypotheses corresponding to a union of multiple disjoint taxa, such as the
set {dolphins, seals, horses, cows, mice}. The lower the mutation rate λ, the
greater the preference for strictly tree-consistent hypotheses over discon-
nected hypotheses. Thus this model captures the basic insights of simpler
heuristic approaches to taxonomic induction (Atran, 1995) but embeds them
in a more powerful probabilistic model that supports fine-grained statistical
inferences.
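A Monte Carlo sketch can make the flavour of such a prior concrete. The toy tree, the unit branch lengths, and the particular mutation rate below are illustrative assumptions, not the chapter's parameter values; the point is only that extensions forming a single taxonomic cluster come out more probable than disconnected ones, and more so as λ decreases.

# Monte Carlo sketch of a mutation-style prior over property extensions.
# A toy tree with unit-length branches; the property is assumed absent at the root and
# switches state along each branch with probability 1 - exp(-lam * length).
import math
import random
from collections import Counter

TREE = ("root", [
    ("A", [("dolphin", []), ("seal", [])]),
    ("B", [("horse", []), ("cow", [])]),
])
LAM = 0.3          # mutation rate; lower values favour single taxonomic clusters
BRANCH_LEN = 1.0

def sample_extension(node, state, out):
    """Propagate the property down the tree, flipping state on each branch at random."""
    name, children = node
    if not children:
        if state:
            out.add(name)
        return
    for child in children:
        flip = random.random() < 1.0 - math.exp(-LAM * BRANCH_LEN)
        sample_extension(child, state != flip, out)

def estimate_prior(n_samples=100_000):
    counts = Counter()
    for _ in range(n_samples):
        ext = set()
        sample_extension(TREE, False, ext)
        counts[frozenset(ext)] += 1
    return {h: c / n_samples for h, c in counts.items()}

prior = estimate_prior()
# Tree-consistent clusters such as {dolphin, seal} come out more probable than
# disconnected sets such as {dolphin, horse}.
print(prior.get(frozenset({"dolphin", "seal"}), 0.0),
      prior.get(frozenset({"dolphin", "horse"}), 0.0))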
Several caveats about the evolutionary model are in order. The mutation
process is just a compact mathematical means for generating a reasonable
prior for biological properties. We make no claim that people have con-
scious knowledge about mutations as specifically biological phenomena, any
more than a computational vision model which appeals to an energy func-
tion claims that the visual system has explicit knowledge about energy. It is
an open question whether the biological principles guiding our model are
explicitly represented in people’s minds or only implicitly present in the in-
ference procedures they use. We also do not claim that a mutation process is
the only way to build a prior that can capture generalizations about generic
biological properties. The key idea captured by the mutation process is that
properties should vary randomly but smoothly over the tree, so that categories
nearby in the tree are more likely to have the same value (present or absent)
for a given property than categories far apart in the tree. Other stochas-
tic processes including diffusion processes, Brownian motion, and Gaussian
processes will also capture this intuition and should predict similar patterns
of generalization (Kemp & Tenenbaum, submitted). The scope of such “prob-
abilistic taxonomic” theories is likely to extend far beyond intuitive biology:
there may be many domains and contexts where the properties of categories
are well described by some kind of smooth probability distribution defined
over a taxonomic-tree structure, and where our evolutionary model or some
close relative may thus provide a compelling account of people’s inductive
reasoning.
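To make the generative idea concrete, here is a minimal sketch (with an invented tree, branch lengths, and mutation rate) of a single property mutating along the branches of a taxonomic tree; it illustrates the qualitative behaviour described above rather than the exact model behind the results reported in this chapter.

```python
# Sketch: a property "mutates" along the branches of a hypothetical tree,
# so species that sit close together tend to share its value.
import random, math

# Hypothetical tree: node -> (parent, branch length); all values illustrative.
tree = {
    "mammal": (None, 0.0),
    "aquatic": ("mammal", 1.0), "land": ("mammal", 1.0),
    "dolphin": ("aquatic", 0.5), "seal": ("aquatic", 0.5),
    "horse": ("land", 0.5), "cow": ("land", 0.5),
}

def sample_property(tree, lam=0.3):
    """Sample present/absent values by flipping state along each branch with a
    probability that grows with branch length and the mutation rate lam."""
    value = {"mammal": random.random() < 0.5}          # state at the root
    for node in ["aquatic", "land", "dolphin", "seal", "horse", "cow"]:
        parent, length = tree[node]
        p_flip = 0.5 * (1.0 - math.exp(-2.0 * lam * length))  # two-state flip prob.
        value[node] = (not value[parent]) if random.random() < p_flip else value[parent]
    return {sp: value[sp] for sp in ["dolphin", "seal", "horse", "cow"]}

# Repeated samples approximate the prior: sister species (dolphin, seal)
# agree on the property more often than distant ones (dolphin, horse).
```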
figure 7.2. One simulated sample from the causal-transmission model, for the food web shown in Figure 7.4a. (a) Initial step showing species hit by the background rate (black ovals) and active routes of transmission (black arrows). (b) Total set of species with disease via background and transmission.
for reasons unrelated to the foodweb, and that four of the causal links are
active (Figure 7.2a). An additional three species contract the disease by eating
a disease-ridden species (Figure 7.2b). Reflecting on these simulations should
establish that the prior captures two basic intuitions. First, species that are
linked in the web by a directed path are more likely to share a novel property
than species which are not directly linked. The strength of the correlation
between two species’ properties decreases as the number of links separating
them increases. Second, property overlap is asymmetric: a prey species is more
likely to share a property with one of its predators than vice versa.
Although the studies we will model here consider only the case of disease
transmission in a food web, many other inductive contexts fit the pattern of
asymmetric causal transmission that this model is designed to capture. Within
the domain of biological species and their properties, the causal model could
also apply to reasoning about the transmission of toxins or nutrients. Outside
of this domain, the model could be used, for example, to reason about the
transmission of lice between children at a day care, the spread of secrets
through a group of colleagues, or the progression of fads through a society.
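As a concrete sketch of the generative process just described, the fragment below draws one sample from a causal-transmission prior over a toy food web: each species contracts the disease at a background rate, and the disease then spreads along predator–prey links with some transmission probability. The food web and parameter values are invented for illustration and are not the materials used in the studies modelled below.

```python
# Sketch: one sample from a causal-transmission prior over a toy food web.
import random

# Directed links run from prey to the predators that eat it (illustrative web).
food_web = {"kelp": ["herring"], "herring": ["tuna", "dolphin"],
            "tuna": ["mako", "human"], "dolphin": [], "mako": [], "human": []}

def sample_disease(web, background=0.2, transmission=0.5):
    # Step 1: background cases, unrelated to the web.
    diseased = {s for s in web if random.random() < background}
    # Step 2: propagate along active links; newly infected predators can
    # pass the disease further up the web.
    frontier = list(diseased)
    while frontier:
        prey = frontier.pop()
        for predator in web[prey]:
            if predator not in diseased and random.random() < transmission:
                diseased.add(predator)
                frontier.append(predator)
    return diseased

# Averaging over many samples yields the asymmetry noted in the text:
# generalization from prey to predator is stronger than from predator to prey.
```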
the structure and stochastic process that are appropriate for reasoning about
a certain domain of entities and properties: generic biological properties,
such as anatomical and physiological attributes, are distributed according to
a noisy mutation process operating over a taxonomic tree; diseases are dis-
tributed according to a noisy transmission process operating over a directed
food web.
For a fixed set of categories and properties, the lower level of the theory is
all we need to generate a prior for inductive reasoning. But the higher level
is not just a convenient abstraction for cognitive modelers to talk about – it
is a critical component of human knowledge. Only these abstract principles
tell us how to extend our reasoning when we learn about new categories,
or a whole new system of categories in the same domain. When European
explorers first arrived in Australia, they were confronted with many entirely
new species of animals and plants, but they had a tremendous head start
in learning about the properties of these species because they could apply
the same abstract theories of taxonomic organization and disease transmis-
sion that they had acquired based on their European experience. Abstract
theories appear to guide children's conceptual growth and exploration in
much the same way. The developmental psychologists Wellman and Gelman
(1992) distinguish between “framework theories” and “specific theories,” two
levels of knowledge that parallel the distinction we are making here. They
highlight framework theories of core domains – intuitive physics, intuitive
psychology, and intuitive biology – as the main objects of study in cognitive
development. In related work (Tenenbaum, Griffiths & Kemp, 2006; Tenen-
baum, Griffiths, & Niyogi, in press), we have shown how the relations between
different levels of abstraction in intuitive theories can be captured formally
within a hierarchical probabilistic model. Such a framework allows us to
use the same Bayesian principles to explain both how theories guide induc-
tive generalization and how the theories themselves might be learned from
experience.
Each of our theory-based models was built by thinking about how some
class of properties is actually distributed in the world, with the aim of giving
a rational analysis of people’s inductive inferences for those properties. It
is therefore not surprising that both the evolutionary model and the causal
transmission model correspond roughly to models used by scientists in rel-
evant disciplines – formalisms like the causal transmission model are used
by epidemiologists, and formalisms like the evolutionary model are used in
biological classification and population genetics. The correspondence is of
course far from perfect, and it is clearest at the higher “framework” level of
abstraction: constructs such as taxonomic trees, predator-prey networks, the
[Figure 7.3, comparing the predictions of the evolutionary, strict-tree, raw-feature, MaxSim, and SumSim models, appears near here.]
all the animals. The prior p(h) assigned to any hypothesis h for the extension
of a novel property is proportional to the number of columns of Fθ (i.e., the
number of familiar features) that are distributed as h specifies – that apply to
just those categories that h posits. Hypotheses that do not correspond to any
features in memory (any column of Fθ) receive a prior probability of zero.
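In sketch form, assuming F is a binary species-by-feature matrix already thresholded at θ, the raw-feature prior amounts to counting how many stored feature columns match each hypothesis; the matrix and hypotheses below are invented for illustration.

```python
# Sketch of the raw-feature prior: p(h) is proportional to the number of
# familiar feature columns whose pattern across species exactly matches h.
import numpy as np

def raw_feature_prior(F, hypotheses):
    counts = np.array([sum(np.array_equal(F[:, j], h) for j in range(F.shape[1]))
                       for h in hypotheses], dtype=float)
    total = counts.sum()
    return counts / total if total > 0 else counts

# Example: 4 species x 3 familiar features (already thresholded at theta).
F = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 1, 1],
              [0, 0, 1]])
hypotheses = [np.array(h) for h in ([1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0])]
print(raw_feature_prior(F, hypotheses))   # the last hypothesis matches no column: prior 0
```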
All of these models (except the strict-tree model) include a single free pa-
rameter: the mutation rate in the evolutionary model, the balance between
similarity and coverage terms in MaxSim and SumSim, or the feature thresh-
old θ in the raw-feature model. Each model’s free parameter was set to the
value that maximized the average correlation with human judgments over all
five datasets.
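The fitting procedure can be summarized as a simple grid search, sketched below with placeholder names for the prediction function and datasets: for each candidate parameter value, correlate the model's predictions with the human judgments on every dataset and keep the value with the highest average correlation.

```python
# Sketch of the parameter-fitting procedure; model_predictions and datasets
# are placeholders for illustration, not the actual materials.
import numpy as np

def fit_free_parameter(model_predictions, datasets, grid):
    """model_predictions(param, dataset) -> vector of predicted argument strengths;
    each dataset carries a matching vector of human judgments under key 'human'."""
    def avg_corr(param):
        rs = [np.corrcoef(model_predictions(param, d), d["human"])[0, 1]
              for d in datasets]
        return np.mean(rs)
    return max(grid, key=avg_corr)

# e.g. best_lambda = fit_free_parameter(evolutionary_model, all_five_datasets,
#                                       grid=np.linspace(0.01, 2.0, 50))
```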
Figure 7.3 compares the predictions for all five of these models on all
five datasets of argument strength judgments. Across these datasets, the
animals in the first set but only some of the animals in the second set. None
of the features in our dataset is strongly associated with dolphins, chimps,
and squirrels, but not also seals and horses. The raw-feature model therefore
finds it hard to discriminate between these two sets of premises, even though
it seems intuitively that the first set provides better evidence that all mammals
have the novel property.
More generally, the suboptimal performance of the raw-feature model
suggests that people’s hypotheses for induction are probably not based strictly
on the specific features that can be retrieved from memory. People’s knowledge
of specific features of specific animals is too sparse and noisy to be the direct
substrate of inductive generalizations about novel properties. In contrast, a
principal function of intuitive domain theories is to generalize beyond people’s
limited specific experiences, constraining the kinds of possible situations
that would be expected to occur in the world regardless of whether they
have been previously experienced (McCloskey et al., 1980; Murphy & Medin,
1985; Carey, 1985). Our framework captures this crucial function of intuitive
theories by formalizing the theory’s core principles in a generative model for
Bayesian priors.
Taken together, the performance of these three Bayesian variants shows
the importance of both aspects of our theory-based priors: a structured rep-
resentation of how categories in a domain are related and a probabilistic
model describing how properties are distributed over that relational struc-
ture. The strict-tree model incorporates an appropriate taxonomic structure
over categories but lacks a sufficiently flexible model for how properties are
distributed. The raw-feature model allows a more flexible prior distribution
for properties, but lacking a structured model of how categories are related,
it is limited to generalizing new, partially observed properties strictly based
on the examples of familiar, fully observed properties. Only the prior in the
evolutionary model embodies both of these aspects in ways that faithfully
reflect real-world biology – a taxonomic structure over species categories and
a mutation process generating the distribution of properties over that tree –
and only the evolutionary model provides consistently strong quantitative fits
to people’s inductive judgments.
Turning now to the similarity-based models, Figure 7.3 shows that their
performance varies dramatically depending on how we define the measure
of similarity to the set of premise categories. MaxSim fits reasonably well,
somewhat worse than the evolutionary model on the two Osherson datasets
but comparably to the evolutionary model on the three Smith datasets. The
fits on the Osherson datasets are worse than those reported by Osherson et al.
(1990), who used direct human judgments of similarity as the basis for the
model rather than the similarity matrix S(F) computed from people's feature
figure 7.4. Multiple relational structures over the same domains of species. (a) A directed network structure capturing food web relations for an “island” ecosystem. (b) A rooted ultrametric tree capturing taxonomic relations among the same species. (c) A directed network structure capturing food web relations for a “mammals” ecosystem. (d) A rooted ultrametric tree capturing taxonomic relations among the same species.
taxonomies are shown in Figure 7.4. Free parameters for all models (the mu-
tation rate in the evolutionary model, the background and transmission rates
for the causal transmission model) were set to values that maximized the
models’ correlations with human judgments.
We hypothesized that participants would reason very differently about dis-
ease properties and genetic properties, and that these different patterns of
reasoning could be explained by theory-based Bayesian models using appro-
priately different theories to generate their priors. Specifically, we expected
that inductive inferences about disease properties could be well approximated
by the causal-transmission model, assuming that the network for causal trans-
mission corresponded to the food web learned by participants. We expected
that inferences about genetic properties could be modeled by the evolutionary
model, assuming that the taxonomic structure over species corresponded to
the tree we recovered from participants’ judgments. We also tested MaxSim
This work has benefited from the insights and contributions of many col-
leagues, in particular Tom Griffiths, Neville Sanjana, and Sean Stromsten. Liz
Bonawitz and John Coley were instrumental collaborators on the food web
studies. JBT was supported by the Paul E. Newton Career Development Chair.
CK was supported by the Albert Memorial Fellowship. We thank Bob Rehder
for helpful comments on an earlier draft of the chapter.
References
Anderson, J. R. (1990). The adaptive character of thought. Hillsdale, NJ:
Erlbaum.
Anderson, J. R. (1991). The adaptive nature of human categorization. Psycho-
logical Review, 98(3), 409–429.
Ashby, F. G., & Alfonso-Reese, L. A. (1995). Categorization as probability
density estimation. Journal of Mathematical Psychology, 39, 216–233.
Atran, S. (1995). Classifying nature across cultures. In E. E. Smith & D. N.
Osherson (Eds.), An invitation to cognitive science (Vol. 3, pp. 131–174).
Cambridge, MA: MIT Press.
Atran, S. (1998). Folkbiology and the anthropology of science: Cognitive
universals and cultural particulars. Behavioral and Brain Sciences, 21, 547–
609.
Carey, S. (1985). Conceptual change in childhood. Cambridge, MA: MIT Press.
Gelman, S., & Markman, E. (1986). Categories and induction in young chil-
dren. Cognition, 23(3), 183–209.
Goodman, N. (1972). Seven strictures on similarity. In Problems and projects.
Indianapolis: Bobbs-Merrill.
Gopnik, A., & Meltzoff, A. (1997). Words, thoughts, and theories. Cambridge,
MA: MIT Press.
Gopnik, A., & Schulz, L. (2004). Mechanisms of theory formation in young
children. Trends in Cognitive Sciences, 8, 371–377.
Griffiths, T. L., & Tenenbaum, J. B. (2005). Structure and strength in causal
induction. Cognitive Psychology, 51, 354–384.
Griffiths, T. L., & Tenenbaum, J. B. (2006). Optimal predictions in everyday
cognition. Psychological Science, 17(9), 767–773.
Griffiths, T. L., & Tenenbaum, J. B. (2007). From mere coincidences to
meaningful discoveries. Cognition.
Heit, E. (1998). A Bayesian analysis of some forms of inductive reasoning. In
M. Oaksford & N. Chater (Eds.), Rational models of cognition (pp. 248–274).
Oxford University Press.
Heit, E., & Rubinstein, J. (1994). Similarity and property effects in induc-
tive reasoning. Journal of Experimental Psychology: Learning, Memory, and
Cognition, 20, 411–422.
Hintzman, D. L., Asher, S. J., & Stern, L. D. (1978). Incidental retrieval and
memory for coincidences. In M. M. Gruneberg, P. E. Morris, & R. N. Sykes
(Eds.), Practical aspects of memory (pp. 61–68). New York: Academic Press.
Kemp, C., Griffiths, T. L., Stromsten, S., & Tenenbaum, J. B. (2004). Semi-
supervised learning with trees. In Advances in Neural Information Processing
Systems 16 (pp. 257–264). Cambridge, MA: MIT Press.
Kemp, C., Perfors, A., & Tenenbaum, J. B. (2004). Learning domain structures.
In Proceedings of the 26th annual conference of the Cognitive Science Society.
Mahwah, NJ: Erlbaum.
Kemp, C., & Tenenbaum, J. B. (2003). Theory-based induction. In Proceedings
of the 25th annual conference of the Cognitive Science Society.
Kemp, C., & Tenenbaum, J. B. (submitted). Structured statistical models of
inductive reasoning.
Kruschke, J. K. (1992). ALCOVE: An exemplar-based connectionist model of
category learning. Psychological Review, 99, 22–44.
Markman, E. (1989). Naming and categorization in children. Cambridge, MA:
MIT Press.
Marr, D. (1982). Vision. New York: W. H. Freeman.
McCloskey, M., Caramazza, A., & Green, B. (1980). Curvilinear motion in
the absence of external forces: Naive beliefs about the motion of objects.
Science, 210(4474), 1139–1141.
McKenzie, C. R. M. (1994). The accuracy of intuitive judgment strategies:
Covariation assessment and Bayesian inference. Cognitive Psychology, 26,
209–239.
McKenzie, C. R. M., & Mikkelsen, L. A. (2000). The psychological side of
Hempel’s paradox of confirmation. Psychonomic Bulletin and Review, 7,
360–366.
Medin, D. L., Coley, J. D., Storms, G., & Hayes, B. K. (2003). A relevance
theory of induction. Psychonomic Bulletin and Review, 10, 517–532.
Mitchell, T. M. (1997). Machine learning. McGraw-Hill.
Murphy, G. L. (1993). Theories and concept formation. In I. V. Meche-
len, J. Hampton, R. Michalski, & P. Theuns (Eds.), Categories and con-
cepts: Theoretical views and inductive data analysis. London: Academic
Press.
Murphy, G. L., & Medin, D. L. (1985). The role of theories in conceptual
coherence. Psychological Review, 92, 289–316.
Steyvers, M., Tenenbaum, J. B., Wagenmakers, E. J., & Blum, B. (2003). In-
ferring causal networks from observations and interventions. Cognitive
Science, 27, 453–489.
Tenenbaum, J. B. (2000). Rules and similarity in concept learning. In S. A.
Solla, T. K. Leen, & K.-R. Müller (Eds.), Advances in Neural Information
Processing Systems 12 (pp. 59–65). Cambridge, MA: MIT Press.
Tenenbaum, J. B., & Griffiths, T. L. (2001). Generalization, similarity, and
Bayesian inference. Behavioral and Brain Sciences, 24, 629–641.
Tenenbaum, J. B., Griffiths, T. L., & Kemp, C. (2006). Theory-based Bayesian
models of inductive learning and reasoning. Trends in Cognitive Sciences,
10(7), 309–318.
Tenenbaum, J. B., Griffiths, T. L., & Niyogi, S. (in press). Intuitive theories as
grammars for causal inference. In A. Gopnik & L. Schulz (Eds.), Causal
learning: Psychology, philosophy, and computation. Oxford: Oxford Univer-
sity Press.
Tenenbaum, J. B., & Xu, F. (2000). Word learning as Bayesian inference. In
Proceedings of the 22nd annual conference of the Cognitive Science Society
(pp. 517–522). Hillsdale, NJ: Erlbaum.
Wellman, H. M., & Gelman, S. A. (1992). Cognitive development: Founda-
tional theories of core domains. Annual Review of Psychology, 43, 337–375.
Xu, F., & Tenenbaum, J. B. (in press). Sensitivity to sampling in Bayesian word
learning. Developmental Science.
Uncertainty is a basic fact of life. Despite uncertainty, people must make pre-
dictions about the world. Will the car you are considering buying be reliable?
Will you like the food you order? When you see an animal in the woods, what
should you do? One source of information that reduces uncertainty is cate-
gory membership. Although Toyota Camrys are not all exactly the same, they
are similar enough that you can predict with some confidence that the new
Camry you are considering will be reliable. Kansas City style barbecue ribs
are not identical, but they taste more similar to one another than they do to
roast chicken or “tofu surprise.” Knowing the category of an entity therefore
serves to reduce the uncertainty associated with it, and the category reduces
uncertainty to the degree that the category members are uniform with respect
to the prediction you want to make. This category-based induction is one of
the main ways that categories are useful to us in everyday life.
Unfortunately, this reduction of uncertainty is limited by the uncertainty
of category membership itself. If you go to the Toyota dealership and order
a Camry, you can be close to 100% sure that your new car is going to be a
Camry. But in many other situations, you cannot be 100% sure. Is the animal
you see in the woods a squirrel, or was your glimpse too fleeting for you to
rule out the possibility it was really a raccoon? Is the person you are dating
honest? You know how you would react to a squirrel or an honest person, but
when you are not sure if these things are in their respective categories, then
you cannot be sure of their other properties.
The central issue we address is whether (and, if so, how and when) multiple
categories are used in induction. If you are uncertain whether the animal is
a squirrel or a raccoon, do you make a different prediction than if you are
sure that the animal is a squirrel? Do you use what you know about both
categories, or only one? For items that are not uncertain, but are members of
multiple categories, the question is whether only one or a number of those
to its likelihood of being in that category (for more detail, see Anderson, 1991;
Murphy & Ross, 1994). For example, if you are trying to predict whether this
fruit will be mushy, you would estimate the likelihood that apples are mushy
and that pears are mushy (and other fruits if they are also possible) and
combine these likelihoods. Thus, this procedure entails that a category-based
induction is made only to the degree that one is certain about the category;
and if multiple categories are possible, then each is used proportionally to its
likelihood.
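For the fruit example, this rule amounts to a probability-weighted average over the candidate categories, as in the following minimal sketch (the membership probabilities and property rates are invented for illustration):

```python
# Sketch of the normative rule (after Anderson, 1991; Murphy & Ross, 1994):
# weight each candidate category's property rate by the probability that the
# item belongs to that category, and sum.
def predict_property(p_category, p_property_given_category):
    return sum(p_category[c] * p_property_given_category[c] for c in p_category)

p_category = {"apple": 0.7, "pear": 0.3}    # uncertain categorization of the fruit
p_mushy = {"apple": 0.2, "pear": 0.6}       # property rate within each category
print(predict_property(p_category, p_mushy))  # 0.7*0.2 + 0.3*0.6 = 0.32
```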
This model appears normatively correct to us, and it is also consistent with
later Bayesian approaches to classification and prediction (e.g., Heit, 2000;
Tenenbaum 2000). However, following the general rule that anything that is
normatively correct people do not actually do, we might well be skeptical of
this proposal. And indeed, we have found consistent evidence that people do
not follow this rule, which we will summarize next.
figure 8.1. An example similar to the categories used in Murphy and Ross (1994),
illustrating the main comparison of that set of studies. In one condition, subjects would
be asked to predict the shading of a new triangle drawn by one of the children. Triangles
were most often drawn by Jessica, so Jessica was the target category. Note that most of
her drawings were black. Liz and Rachel also each drew a triangle, and they also drew
black items. Thus, information from these categories might increase one’s estimate that
a new triangle would be black (if one used multiple categories). The control question
might be to make a prediction about a new square. Rachel is most likely to have drawn a
square, and most of her figures are empty. Mary and Jessica also drew squares but have no
empty figures, and so the secondary categories do not strengthen this induction. Thus,
if people were using multiple categories, their inductions of the first example (triangle-
solid black) should be stronger than those in the second example (square-empty). But
they are not.
than one category. Similarly, Murphy and Ross (2005, Expt. 3) repeated their
experiment asking the categorization question after the induction question,
so that it would not influence the induction judgment. The pattern of results
was identical to that obtained with the reverse order of questions.
that they were not certain in their initial categorization (either through rat-
ings or else by observing considerable variance in that categorization across
subjects), this uncertainty did not lower the certainty of their induction, nor
did it make them attend to other categories that the item might have been
in. One particularly striking aspect of these results is that on one question
people might admit that they are not sure what category an item is in but
on the next question they do not show any sign of this uncertainty on the
induction using this category. It is as if these two judgments take place in
compartmentalized phases. First one decides what category the object is in.
Perhaps it is not certain, but the object seems most likely to be a robin. Then
one makes an induction about the object, given that it is actually a robin.
In this second phase, the uncertainty about the initial categorization has no
effect – it is as if the uncertain categorization has been walled off and is now
treated as if it were certain.
Why is this? One possibility is that people simply cannot carry out the
computations necessary to use category uncertainty when making inductions.
It is possible that keeping track of different categories, their properties, and
their uncertainty is simply more than working memory can do. A second
possibility is that people do not recognize the principle involved. That is, they
may not realize that if you are uncertain about an object’s identity, you should
reduce the certainty of predictions about the object. This might be consistent
with findings from the decision-making literature such as the lack of using
base rates in making predictions. Just as people do not attend to the frequency
of green cabs when deciding how likely it is that a green cab was involved in
an accident (Tversky & Kahneman, 1980), they may not attend to how likely
the object is to be a robin or blackbird when making a prediction about it.
However, neither of these possibilities has survived empirical testing. We
have been able to find certain conditions in which people are sensitive to
category uncertainty, which means that they must have the computational
ability to use it in making decisions, and that they must understand the
importance of using it, at least in the abstract. Ross and Murphy (1996)
constructed a new induction question to test the limits of the earlier results.
In our original work, for cases in which the real estate agent was the primary
category, we used questions like “What is the probability that the man walking
up the driveway will pay attention to the sturdiness of the doors on the house?”
This is something that both real estate agents and burglars would do and cable
TV workers would not (according to a norming study), yet the possibility that
the person might be a burglar instead of a cable TV worker did not raise the
rated probability of this property. In the new experiment, we made questions
that were directly associated to the secondary category (a burglar, in this
case), for example, “What is the probability that the man walking up the
driveway will try to find out whether the windows are locked?” This activity is
directly associated with burglars, whereas cable TV workers have little interest
in windows. Our hypothesis was that in reading this question, subjects would
be reminded of the possibility that the man might be a burglar. They might
then increase their prediction (compared to the cable TV worker condition)
as a result of being reminded of this category. And this is what happened: The
predictions were 21% higher when a burglar was the secondary category than
when a cable TV worker was.
Apparently, if the induction question makes people think of the secondary
category, they can use information about that category in making their predic-
tion. Perhaps using multiple categories in that question would raise people’s
awareness of the importance of considering secondary categories in general,
and so they might use both categories to answer subsequent questions. But in
fact, that did not happen. In two experiments, we found that when people an-
swered the induction questions about properties associated to the secondary
category, they seemed to be using two categories, but on subsequent ques-
tions about the same scenario (with unassociated properties), they reverted
to using a single category (Ross & Murphy, 1996).
These results suggest the following, slightly informal explanation: If you hit
people over the head with the fact that there are two possible categories, they
can and do use both categories; if you don’t hit them over the head, then they
do not. This pattern of results shows that people do have the computational
ability to deal with two categories (at least) and that they also can recognize
the principle that both categories are relevant. But this ability and knowledge
are put to work only in fairly circumscribed situations. As will be seen, the
critical element in whether people do or do not use multiple categories seems
to be whether the secondary category is brought to their attention while they
are answering the question. If it is not, the general principle of hedging one’s
bets in uncertain cases (Anderson, 1991) does not seem strongly enough
represented to cause people to use multiple categories.
Once again, we are left with the observation that classification seems to
be a largely separate process from induction, even though they are logically
strongly connected. One possible reason for this is that in many cases, the
two are in fact done separately. We may classify entities without making an
interesting induction about them: the squirrels on the lawn as we walk to
work, the cars on the street, the trees in front of the house, the buildings, the
people, the chairs in our office, and so on. We often do not need to make
any predictions about the squirrels, trees, or chairs. And when we do make
predictions, the prediction is often not category-based (e.g., to decide how to
avoid something on your path, it is often more useful to know the object’s size
than its category). Furthermore, in many cases when we do make inductions,
the category is certain. If we are worried about a dog prowling around the
yard, we are likely not worried about whether it is a dog – that may be obvious.
We are worried about what it might do. If we try to predict how our doctor
will treat our condition, we do not worry about whether she is or is not really
a doctor. We do not know exactly how often people are confronted with the
situation in which something has an uncertain categorization and yet they
attempt to make a prediction about it. Although we believe that induction with
categorical uncertainty is a common event in an absolute sense, its relative
frequency may be dwarfed by thousands of occasions in which categorization
is clear or we do not draw an induction. Thus, when induction must be
done when categories are uncertain, people may carry over habits from their
experiences with certain categories.
We may be a bit better at dealing with uncertainty when we know almost
nothing about the entity’s categorization. For example, if we stop someone on
the street to ask a question, we can have only a very vague idea of the person’s
profession or religion or hobbies. Under such circumstances, people take
into account the uncertainty. In an experiment using the geometric figure
categories, we placed one triangle (say) in all four categories and then asked
people to make predictions about a new triangle. We obtained evidence that
people used all four categories in this situation (see Murphy & Ross, 1994, for
the details). We assume that the reason for this is that people have no preferred
category in this situation and so cannot engage in the two-stage process we
have described. That is, they could literally see that the triangle was equally
likely to be in each of four categories and thus had no basis for picking one
of the categories, and so they considered all four of them. However, this
situation would be quite rare in real life, and we suspect that when there is
only very weak evidence for a number of categories, people simply do not
engage in category-based induction. That is, even if you believed profession is
a good predictor of helpfulness, if you cannot tell what profession a stranger
is likely to be in, you may not make a guess about his profession to predict his
willingness to help you. Instead, you will simply not make the induction or
will try to find another basis on which to make it.
However, even in situations where there is no clear reason that a category-
based induction would help, people may use a single category that is brought
to mind. Lagnado and Shanks (2003) described an interesting situation like
this, using hierarchically organized categories of persons. They had two super-
ordinate categories of readers of different types of newspapers, which we will
just call A and B, each with two subcategories of particular newspapers. There
action when the task is easy or the application is obvious, but they do not
follow the principle in more difficult conditions (e.g., Baillargeon, 1998). In
our case, when the induction property itself brings up the alternative category
(and when the subject has already identified the other category as most likely),
both categories are together in working memory, and the person may realize
the importance of using both of them. But when the question does not
bring to mind an alternative category, only the selected category is in working
memory, and the principle itself is not accessed. Presumably, people who make
predictions for a living, such as weather forecasters, poker players, and the
like, have more practice in dealing with their uncertainty and may be better at
answering these questions. (Note to granting agencies: Anyone who would like
to fund a case study of induction in poker playing, please contact the authors.)
Multiple categories are important when an object’s identity is uncertain. But
multiple categories may be important in a different situation as well, namely,
when an object has multiple “identities.” We are not referring to a questionable
personality disorder but to the fact that categories are cross-cutting. Rather
than a single hierarchy in which every entity fits into exactly one category at
a given level of specificity, categories are a hodge-podge of groupings with
different bases, overlapping to various degrees. For example, within categories
pertaining to people, there are classifications based on personality, size, eth-
nicity, profession, gender, sexual orientation, areas of interest and expertise,
age, educational level, financial status, and so on. These categories are not
all independent, but their bases may be quite different (e.g., gender vs. pro-
fession), and so they change the way that one thinks about the person being
categorized. A large, outgoing forty-two-year-old Asian-American female ac-
countant who is married, likes sewing and football, has a B.A. and an M.B.A., and
is upper-middle class is a member of a few dozen person categories. If you have
to make a prediction about this person, how will you decide which category
to use? Will you use just one? If so, how do you choose which category to use?
Or will you use some or all of these categories? Research in social psychology
has addressed this problem specifically for social categories and stereotypes
(e.g., Macrae, Bodenhausen, & Milne, 1995; Nelson & Miller, 1995; Patalano,
Chin-Parker, & Ross, 2006), and we have investigated it for object categories.
Cross-classification raises issues that are very analogous to those we con-
sidered in the multiple-categorization problems above. However, here the
problem is not of uncertainty but of multiplicity. That is, we are not uncer-
tain whether the person is female or an accountant – we know she is both.
Nonetheless, that does not mean that people will use both categories in making
a prediction. If one category is more relevant than the others, then people
tend to use that as the basis for prediction. For example, Heit and Rubinstein
(1994) showed that when making inductions about a biological property from
one animal to another, people had more confidence when the two animals
were biologically related (e.g., were both mammals). However, when making
inductions about a behavioral property, they were more confident when the
two animals shared an ecological niche (e.g., were both ocean dwellers; also
see Ross & Murphy, 1999). This selective use of relevant categories is not
limited to adults. Kalish and Gelman (1992) found that preschool children
made inductions about the physical properties of a new object based on its
physical make-up (e.g., being made of glass) but made functional inductions
based on the object’s artifact category (e.g., being a knife) (though see Nguyen
& Murphy, 2003).
This research shows that people make inductions at least partly based on
their beliefs about whether the category supports induction of the property
(see also Proffitt, Medin, & Coley, 2000). Thus, these beliefs provide one way
of narrowing down the many categories that an object or person might belong
to. Although Cheri may be a mother, this does not tell you what her highest
educational degree is or whether she is wealthy. Thus, categories like mother
may not be used in making these inductions about Cheri, whereas profession
might be used. What is not so clear from these studies is what people do when
multiple categories are relevant and potentially useful. The above studies were
specifically designed so that different categories were differentially relevant,
and so they could not answer this question. Therefore, we addressed this
issue in a manner analogous to our previous work on multiple-category
induction.
The problem with studying cross-classification is that we cannot as eas-
ily manipulate the categories experimentally, since learning complex cross-
cutting categories would be rather difficult. Instead, we used real-world cat-
egories of food items, because we had already found that foods are very
cross-classified (Ross & Murphy, 1999). For example, a bagel is a bread and
a breakfast food. Because we were interested in induction, we created a novel
predicate for each problem that we attributed to different categories (as in
classic work on category-based induction such as Osherson, Smith, Wilkie,
López, & Shafir, 1990; Rips, 1975). We showed subjects a table of different
categories with these properties in them, which might include information
such as breads contain chloride salts 55% of the time, and breakfast foods
contain chloride salts 25% of the time. These were only two of the facts listed,
so that subjects would not know which predicates were of special interest.
Then we asked the subjects to judge (among other things) what percentage of
time a bagel had chloride salts. The question was whether they would use one
of the categories (breads or breakfast foods) or both in arriving at an answer.
We found (Murphy & Ross, 1999, Expt. 1) that the majority of subjects used
only one category, but that a minority consistently used both categories in
their induction. Thus, as in our earlier work on uncertain categorizations, it
appeared that people had the ability to combine information from different
categories but did not generally see the necessity of doing so. The choice to use
multiple cross-cutting categories appeared to be a fairly high-level strategic
decision.
Unfortunately, the story is a bit more complex. In later experiments, we
simplified the display, presenting only the critical two categories (e.g., bread
and breakfast food in the bagel example) and asking fewer questions. With
these changes, the majority of people used both categories. Furthermore,
when we constructed a version of this procedure using ambiguous categories
(like the real estate agent/burglar cases considered in our previous work),
we found that people again used multiple categories most of the time (see
Murphy & Ross, 1999, for details).
These results complicate the picture slightly, but they can be explained in
terms of the conclusions drawn earlier. In this series of studies, we presented
the critical categories along with information about predicates (e.g., how of-
ten items in the category have chloride salts). This procedure had the effect
of bringing to people’s attention exactly the relevant categories and predicate.
That is, the presentation format focused people on the information relevant
to making the induction, removing other information that was not relevant,
and so they accessed both categories. When the task did not focus people’s
attention on the critical categories (as in Experiment 1 of the series, when the
critical categories were just two of four presented, and more questions were
asked), they generally used only one category. Thus, these results are theo-
retically consistent with the very different manipulation of using a property
associated to the category (as in the burglar–locked window example) – both
serve the function of bringing an alternative category to the person’s atten-
tion. Surprisingly (to us), we found almost identical results in this paradigm
for the cross-cutting categories (like breakfast food/bagel) and uncertain cat-
egories. Thus, these results suggest that in cases where it is crucial to attend
to different possible categories (e.g., medical decision making with uncertain
diagnosis), explicitly presenting the categories and the relevant information
may encourage people to make more accurate inductions.
Further work needs to be done on induction of cross-cutting categories. It
is clear that people can choose which category is relevant when only one of
them is related to the induction property (as in Heit & Rubinstein, 1994). It
is also clear that people can and will use two cross-cutting categories when
it is obvious that both are relevant, and the property information is easy to
retrieve (as in Murphy & Ross, 1999). What is not so clear is what people do in
more naturalistic situations in which the categories and induction feature are
not explicitly presented. Unfortunately, this situation is exactly the one that is
hardest to study, because it is difficult to identify and control the information
known about each category. That is, to show use of multiple categories, we
need to know the person’s beliefs about each category so that we can tell
whether he or she is relying on one or both of them. In the ambiguous-
category case, we could control the categories by constructing scenarios in
which different categories were mentioned. But in cross-classification, the
item itself determines its categories, and we cannot make a bagel a dinner
entree versus a breakfast food in different conditions. So, this topic is awaiting
a clever idea for how to investigate it.
the first would be more heavily weighted than the second, because it is more
specific. Tenenbaum (2000) reports a related experiment.
Tenenbaum found that both of these experiments were fit well by a Bayesian
rule that considered each hypothesis, weighted by its likelihood. Assuming
that the number ranges are like categories and the given data are like exemplars,
his results suggest that people do after all use multiple categories in making
predictions, just as Anderson (1991) said they should, contrary to our results
in analogous situations.
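To make this concrete, the sketch below implements hypothesis averaging for a toy version of the number-range task: every interval consistent with the observed examples counts as a hypothesis, narrower intervals are weighted more heavily (Tenenbaum's size principle), and the prediction for a new number averages over all of them. The hypothesis space and example numbers are invented for illustration.

```python
# Sketch of Bayesian hypothesis averaging over number ranges (toy version).
def predict(observed, query, lo=1, hi=100):
    numerator = denominator = 0.0
    for a in range(lo, hi + 1):
        for b in range(a, hi + 1):
            if all(a <= x <= b for x in observed):        # hypothesis [a, b] fits the data
                likelihood = (1.0 / (b - a + 1)) ** len(observed)  # size principle
                denominator += likelihood
                if a <= query <= b:
                    numerator += likelihood
    return numerator / denominator

print(predict(observed=[48, 52, 55], query=50))   # high: inside every consistent range
print(predict(observed=[48, 52, 55], query=90))   # low: only broad ranges include it
```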
What are we to make of this apparent contradiction? There are a number of
differences between Tenenbaum’s problems and the ones we have used in our
experiments, and it is likely that many of them contribute to the (apparent)
difference in results. One major difference is that our paradigm has focused
on fairly fixed object categories, whereas the number ranges of Tenenbaum
(1999) are rather ad hoc categories that do not have any independent existence.
That is, the range 46 to 55 is considered only because it happens to include all
the presented data points and not because there is any special coherence or
similarity to this set of numbers. In contrast, our geometric shape categories
were supposedly drawn by different children, were spatially separated, and
had some family resemblance. Our natural categories, such as real estate agent
or breakfast foods, were familiar categories that people use in everyday life.
It is possible that only such entrenched categories show the bias for single-
category use that we have found in most experiments.
Furthermore, it is possible that Tenenbaum’s subjects are not considering
each range of numbers as an explicit category but instead are using a similarity
metric comparing the novel data point to the presented data. Certainly, in
doing this task, it does not seem that subjects are consciously considering “Is
it 43 to 55 or 46 to 56 or 44 to 56 or 46 to 55 or. . . .?”(at least, we don’t, when
we try this task). However, in our experiments with a very small number
of possible categories, we suspect that people are consciously choosing the
category the item is in. Then, because we require them to provide an induction
strength judgment, they must give a numerical response. This response may
further encourage the use of a single category, since subjects must come up
with an actual number, and that becomes increasingly difficult to do when
the number of possible categories increases, each with different information
in it. Although we do not have direct evidence that these factors account for
the apparent differences in results, we do have some recent evidence that use
of multiple categories depends on the experimental task, including the type
of decision being made.
In a recent study with Michael Verde (Verde, Murphy, & Ross, 2005), we
compared our past paradigm in which subjects provide probabilities as a
control your induction even if other brown, fast-moving stimuli have had
different properties (e.g., you have had bad experiences with coyotes sneaking
up on you).
In short, in some cases, induction may be based not on category-level in-
formation at all but instead on remembered exemplars. Such inductions have
many of the properties of inductions that Bayesian models such as Tenen-
baum’s and Anderson’s have. That is, they would “weight” information from
the most likely category most of all (because most of the exemplars would
come from that category), and information from other categories would
be weighted proportionally to their likelihood (based on the number of re-
membered exemplars from each category). But the induction itself might
be somewhat different from the one envisioned in Anderson’s model, where
category-level information is the basis of induction. That is, if you identify the
brown animal as a deer, then information about deer in general would be used
in Anderson’s model. In the exemplar-based approach we are considering, it
might be only the properties of the particular remembered exemplars that af-
fect the decision, and if the remembered exemplars are not very typical of their
categories, then the prediction would deviate from a categorical induction.
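The contrast between the two routes can be sketched as follows, with invented categories, exemplars, and numbers: a category-based prediction consults summary information about the single most likely category, whereas an exemplar-based prediction pools the properties of whatever individual exemplars come to mind.

```python
# Sketch contrasting category-based and exemplar-based prediction routes.
def category_based(best_category, p_property_given_category):
    # Use summary knowledge about the single most likely category.
    return p_property_given_category[best_category]

def exemplar_based(retrieved_exemplars):
    # retrieved_exemplars: list of (category, has_property) for remembered items;
    # the prediction simply pools whatever exemplars were retrieved.
    return sum(has_prop for _, has_prop in retrieved_exemplars) / len(retrieved_exemplars)

p_runs_away = {"deer": 0.9, "coyote": 0.3}
retrieved = [("deer", True), ("deer", True), ("coyote", False)]   # mostly deer memories
print(category_based("deer", p_runs_away))   # 0.9
print(exemplar_based(retrieved))             # about 0.67: pulled toward the atypical memory
```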
A third possibility is that people might use both category and exemplar
information in making the prediction. The activated properties provide some
exemplar-based information that might combine with the category-level in-
formation, depending on the circumstances (time pressure, complexity of
prediction, etc.). One can think of this combination as a type of special-
ization, where its usefulness depends on the consistency of category and
exemplar-level information. It allows the greater statistical basis of the cate-
gory to influence the decision but also allows sensitivity to the specifics of the
item about which the prediction is made (see, e.g., Brooks, 1987; Medin &
Ross, 1989; Murphy, 2002, for specificity effects). As Murphy and Ross (2005)
argued, even when one makes category-based inductions, properties of the
specific item involved may matter as well.
So far as we know, there has been no attempt to distinguish the exemplar-
based and category-based induction processes for coherent categories. One
reason for this is that in the classic category-based induction work, subjects are
presented with entire categories and/or an undescribed exemplar. Consider
the example, “If sparrows have sesamoid bones, what is the probability that
eagles have sesamoid bones?” Here, only entire categories are mentioned. The
questions asked by Rips (1975), Osherson et al. (1990), and the more recent
work of Proffitt et al. (2000) all focus on category-level problems.
An important topic for future research, then, is to investigate how item-level
and category-level information is used and integrated in making predictions
(Murphy & Ross, 2005). If you believe that dogs get sick from eating chocolate,
will you change your prediction for a particularly robust, voracious dog who
is known to eat any semi-edible garbage that it can grab? How about a yappy,
pampered toy dog of weak disposition? Proffitt et al. (2000) make a strong
argument that inductions are based on causal knowledge when it is available.
The usual use of blank predicates is an attempt to eliminate such knowledge
as a factor. But in everyday life, people do have beliefs about the predicates
they make inductions about, and they also have knowledge about the specific
individuals they are interested in (e.g., they can see the dog, they know their
doctor, they have ridden in this car, and so on). The items are not unknown
objects described only by their category name: “a dog,” “a doctor,” or “a car.”
So, to expand our understanding of induction further, we need to begin to
address the interaction between category-based induction and individual-
based induction.
In summary, we would often be better off if we considered multiple categories
when making inductions about entities. When categorization is uncertain,
our inductions ought to take our uncertainty into account. However, in a
number of paradigms manipulating a number of variables, we have found
that people generally do not spontaneously consider multiple categories when
making inductions. Perhaps surprisingly, this does not appear to be due to
unfamiliarity with the principle involved, as shortcomings in decision making
often seem to be. Nor is the problem that people are not able to combine the
predictions drawn from different categories. Instead, the problem appears to
be a kind of production deficit in which the necessity of considering multiple
categories does not occur to people except in situations that strongly bring
the secondary categories to attention. When the property is associated to the
secondary category, or when that category is presented in a way that is difficult
to ignore (Murphy & Ross, 1999), people integrate information from multiple
categories.
On the negative side, people’s reluctance to spontaneously combine predic-
tions from categories suggests that they will often make suboptimal predic-
tions. But on the positive side, when the necessity for using multiple categories
is brought to the fore, people are willing and able to use them. Thus, our results
give some hope for improvement in category-based inductions in situations
where those inductions are important. By presenting the secondary category
prominently or asking just the right question, people can be encouraged to
take multiple categories into account.
Finally, our most recent work suggests the possibility that performance in
tasks that we think of as category-based induction may not necessarily be
based on categories. Performance that seems to involve the use of multiple
categories might be accomplished if people skip the categories entirely, relying
on remembered exemplars. Although this is speculative, we find the possibility
that people make use of both category and exemplar information when
making inductions an interesting one that also relates research on
category-based induction to current work on classification.
Please address all correspondence to Gregory L. Murphy, Department of
Psychology, New York University, 6 Washington Place, 8th floor, New York,
NY 10003; or email correspondence to [email protected].
References
Anderson, J. R. (1991). The adaptive nature of human categorization. Psychological
Review, 98, 409–429.
Baillargeon, R. (1998). Infants’ understanding of the physical world. In M. Sabourin,
F. Craik, & M. Robert (Eds.), Advances in psychological science, Vol. 2: Biological and
cognitive aspects (pp. 503–529). London: Psychology Press.
Brooks, L. R. (1987). Decentralized control of categorization: The role of prior processing
episodes. In U. Neisser (Ed.), Concepts and conceptual development: Ecological and
intellectual factors in categorization (pp. 141–174). Cambridge: Cambridge University
Press.
Heit, E. (2000). Properties of inductive reasoning. Psychonomic Bulletin & Review, 7,
569–592.
Heit, E., & Rubinstein, J. (1994). Similarity and property effects in inductive reasoning.
Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 411–422.
Kalish, C. W., & Gelman, S. A. (1992). On wooden pillows: Multiple classification and
children’s category-based inductions. Child Development, 63, 1536–1557.
Lagnado, D. A., & Shanks, D. R. (2003). The influence of hierarchy on probability
judgments. Cognition, 89, 157–178.
Macrae, C. N., Bodenhausen, G. V., & Milne, A. B. (1995). The dissection of selection in
person perception: Inhibitory processes in social stereotyping. Journal of Personality
and Social Psychology, 69, 397–407.
Malt, B. C., Ross, B. H., & Murphy, G. L. (1995). Predicting features for members of nat-
ural categories when categorization is uncertain. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 21, 646–661.
Medin, D. L., & Ross, B. H. (1989). The specific character of abstract thought: Cate-
gorization, problem solving, and induction. In R. J. Sternberg (Ed.), Advances in the
psychology of human intelligence, Vol. 5. Hillsdale, NJ: Erlbaum.
Murphy, G. L. (2002). The big book of concepts. Cambridge, MA: MIT Press.
Murphy, G. L., & Ross, B. H. (1994). Predictions from uncertain categorizations. Cogni-
tive Psychology, 27, 148–193.
Murphy, G. L., & Ross, B. H. (1999). Induction with cross-classified categories. Memory
& Cognition, 27, 1024–1041.
Murphy, G. L., & Ross, B. H. (2005). The two faces of typicality in category-based
induction. Cognition, 95, 175–200.
Nelson, L. J., & Miller, D. T. (1995). The distinctiveness effect in social categorization:
You are what makes you unusual. Psychological Science, 6, 246–249.
Nguyen, S. P., & Murphy, G. L. (2003). An apple is more than just a fruit: Cross-
classification in children’s concepts. Child Development, 74, 1783–1806.
Osherson, D. N., Smith, E. E., Wilkie, O., López, A., & Shafir, E. (1990). Category-based
induction. Psychological Review, 97, 185–200.
Patalano, A. L., Chin-Parker, S., & Ross, B. H. (2006). The importance of being coher-
ent: Category coherence, cross-classification, and reasoning. Journal of Memory and
Language, 54, 407–424.
Proffitt, J. B., Coley, J. D., & Medin, D. L. (2000). Expertise and category-based induction.
Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 811–828.
Rips, L. J. (1975). Inductive judgments about natural categories. Journal of Verbal Learn-
ing and Verbal Behavior, 14, 665–681.
Ross, B. H., & Murphy, G. L. (1996). Category-based predictions: Influence of uncertainty
and feature associations. Journal of Experimental Psychology: Learning, Memory, and
Cognition, 22, 736–753.
Ross, B. H., & Murphy, G. L. (1999). Food for thought: Cross-classification and category
organization in a complex real-world domain. Cognitive Psychology, 38, 495–553.
Tenenbaum, J. B. (1999). Bayesian modeling of human concept learning. Advances in
Neural Information Processing Systems, 11, 59–68.
Tenenbaum, J. B. (2000). Rules and similarity in concept learning. Advances in Neural
Information Processing Systems, 12, 59–65.
Tversky, A., & Kahneman, D. (1980). Causal schemas in judgments under uncertainty.
In M. Fishbein (Ed.), Progress in social psychology (pp. 49–72). Hillsdale, NJ: Erlbaum.
Verde, M. F., Murphy, G. L., & Ross, B. H. (2005). Influence of multiple categories on
property prediction. Memory & Cognition, 33, 479–487.
In the 1890s, the great American philosopher C. S. Peirce (1931–1958) used
the term “abduction” to refer to a kind of inference that involves the gen-
eration and evaluation of explanatory hypotheses. This term is much less
familiar today than “deduction,” which applies to inference from premises
to a conclusion that has to be true if the premises are true. And it is much
less familiar than “induction,” which sometimes refers broadly to any kind
of inference that introduces uncertainty, and sometimes refers narrowly to
inference from examples to rules, which I will call “inductive generalization.”
Abduction is clearly a kind of induction in the broad sense, in that the gener-
ation of explanatory hypotheses is fraught with uncertainty. For example, if
the sky suddenly turns dark outside my window, I may hypothesize that there
is a solar eclipse, but many other explanations are possible, such as the arrival
of an intense storm or even a huge spaceship.
Despite its inherent riskiness, abductive inference is an essential part of
human mental life. When scientists produce theories that explain their data,
they are engaging in abductive inference. For example, psychological theories
about mental representations and processing are the result of abductions
spurred by the need to explain the results of psychological experiments. In
everyday life, abductive inference is ubiquitous, for example when people
generate hypotheses to explain the behavior of others, as when I infer that
my son is in a bad mood to explain a curt response to a question. Detectives
perform abductions routinely in order to make sense of the evidence left by
criminal activity, just as automobile mechanics try to figure out what problems
are responsible for a breakdown. Physicians practice abduction when they try
to figure out what diseases might explain a patient’s symptoms. Table 9.1
summarizes the kinds of abductive inference that occur in various domains,
Domains          Targets to be Explained       Explanatory Hypotheses
science          experimental results          theories about structures and processes
medicine         symptoms                      diseases
crime            evidence                      culprits, motives
machines         operation, breakdowns         parts, interactions, flaws
social           behavior                      mental states, traits
involving both targets that require explanation and the hypotheses that are
generated to explain them. Abduction occurs in many other domains as well,
for example, religion, where people hypothesize the existence of God in order
to explain the design and existence of the world.
The next section will briefly review the history of the investigation of
abduction by philosophers and artificial intelligence researchers, and discuss
its relative neglect by psychologists. First, however, I want to examine the
nature of abduction and sketch what would be required for a full psychological
theory of it. I then outline a neurocomputational theory of abductive inference
that provides an account of some of the neural processes that enable minds to
make abductive inference. Finally, I discuss the more general implications of
replacing logic-based philosophical analyses of human inference with theories
of neural mechanisms.
Here are the typical stages in the mental process of abduction. First, we no-
tice something puzzling that prompts us to generate an explanation. It would
be pointless to waste mental resources on something ordinary or expected.
For example, when my friends greet me with the normal “Hi,” I do not react
like the proverbial psychoanalyst who wondered “What can they mean by
that?” In contrast, if a normally convivial friend responds to “Good morn-
ing” with “What’s so good about it?”, I will be prompted to wonder what is
currently going on in my friend’s life that might explain this negativity. Peirce
noticed that abduction begins with puzzlement, but subsequent philosophers
have ignored the fact that the initiation of this kind of inference is inherently
emotional. Intense reactions such as surprise and astonishment are partic-
ularly strong spurs to abductive inference. Hence the emotional initiation
of abductive inference needs to be part of any psychological or neurological
theory of how it works. An event or general occurrence becomes a target for
explanation only when it is sufficiently interesting and baffling. I know of no
general experimental evidence for this claim, but Kunda, Miller, and Claire
principle modus ponens, is actually a far richer and more powerful form of
thinking.
been concerned with medical reasoning. For example, the RED system takes
as input descriptions of cells and generates and evaluates hypotheses about
clinically significant antibodies found in the cells (Josephson and Josephson,
1994). More recently, discussions of causal reasoning in terms of abduction
have been eclipsed by discussions of Bayesian networks based on probability
theory, but later I will describe limitations of purely probabilistic accounts
of causality, explanation, and abduction. Abduction has also been a topic of
interest for researchers in logic programming (Flach and Kakas, 2000), but
there are severe limitations to a characterization of abduction in terms of
formal logic (Thagard and Shelley, 1997).
Some AI researchers have discussed the problem of generating explanatory
hypotheses without using the term “abduction.” The computational models
of scientific discovery described by Langley, Simon, Bradshaw, and Zytkow
(1987) are primarily concerned with the inductive generalization of laws from
data, but they also discuss the generation of explanatory structure models in
chemistry. Langley et al. (2004) describe an algorithm for “inducing explana-
tory process models,” but it is clear that their computational procedures for
constructing models of biological mechanisms operate abductively rather
than via inductive generalization.
Psychologists rarely use the terms “abduction” or “abductive inference,”
and very little experimental research has been done on the generation and
acceptance of explanatory hypotheses. Much of the psychological literature
on induction concerns a rather esoteric pattern of reasoning, categorical
induction, in which people express a degree of confidence that a category has
a predicate after being told that a related category has the predicate (Sloman
and Lagnado, 2005). Here is an example:
Tigers have 38 chromosomes.
Do buffaloes have 38 chromosomes?
Another line of research involves inductive generalizations about the behavior
of physical devices (Klahr, 2000). Dunbar (1997) has discussed the role of
analogy and other kinds of reasoning in scientific thinking in real-world
laboratories. Considerable research has investigated ways in which people’s
inductive inferences deviate from normative standards of probability theory
(Gilovich, Griffin, and Kahneman, 2002).
Experimental research concerning causality has been concerned with top-
ics different from the generation of causal explanations, such as how people
distinguish genuine causes from spurious ones (Lien and Cheng, 2000) and
how knowledge of the causal structure of categories supports the ability to
infer the presence of unobserved features (Rehder and Burnett, 2005). Social
3.
The structure of abduction is roughly this:
[Figure: a neural structure representing the target and a neural structure representing the explanatory hypothesis, linked to neural structures for the emotional states of puzzlement and satisfaction.]
6 months, infants believe that the distance traveled by the stationary object is
proportional to the size of the moving object. Thus at a very primitive stage of
verbal development children seem to have some understanding of causality
based on their visual and tactile experiences. According to Mandler (2004),
infants’ very early ability to perceive causal relations need not be innate but
could arise from a more general ability to extract meaning from perceptual
relationships. Whether or not it is innate, infants clearly have an ability to
extract causal information that develops long before any verbal ability.
Recent work using functional magnetic resonance imaging has investigated
brain mechanisms underlying perceptual causality (Fugelsang et al., 2005).
Participants imaged while viewing causal events had higher levels of relative
activation in the right middle frontal gyrus and the right inferior parietal lob-
ule compared to those viewing non-causal events. The evidence that specific
brain structures are involved in extracting causal structure from the world
fits well with cognitive and developmental evidence that adults and children
are able to perceive causal relations, without making inferences based on uni-
versality, probability, or causal powers. It is therefore plausible that people’s
intuitive grasp of causality, which enables them to understand the distinc-
tion between causal relations and mere co-occurrence, arises very early from
perceptual experience. Of course, as people acquire more knowledge, they
are able to expand this understanding of causality far beyond perception,
enabling them to infer that invisible germs cause disease symptoms. But this
extended understanding of causality is still based on the perceptual experience
of one event making another happen, and does not depend on a mysterious,
metaphysical conception of objects possessing causal powers. For discussion
of the role of causality in induction, see Bob Rehder’s Chapter 4 in this
volume.
Now we can start to flesh out in neurological terms what constitutes the re-
lation between a target and an explanatory hypothesis. Mandler (2004) argues
that CAUSED-MOTION is an image schema, an abstract, non-propositional,
spatial representation that expresses primitive meanings. Lakoff (1987) and
others have proposed that such non-verbal representations are the basis for
language and other forms of cognition. Feldman and Narayan (2004) have de-
scribed how image schemas can be implemented in artificial neural systems. I
will assume that there is a neurally encoded image schema that establishes the
required causal relation that ties together the neural structure of a hypothesis
and the neural structure of the target that it explains. We would then have
a neural representation of the explanatory, causal relation between hypothe-
ses and targets. This relation provides the abductive basis for the inferential
process described in the next section.
The model of abductive inference sketched in Figures 9.1 and 9.2 has
been implemented in a computer simulation that shows in detail how neural
processes can generate emotional initiation and causal reasoning (Thagard
and Litt, forthcoming). The details are too technical to present here, but the
simulation is important because it shows how causal and emotional informa-
tion distributed over thousands of artificial neurons can produce a simple
form of abductive inference.
5.
On the standard philosophical view, inference is the movement from one
or more propositions taken to be true to another proposition that follows
from them deductively or inductively. Here a proposition is assumed to be an
abstract entity, namely, the meaning content of a sentence. Belief and other
mental states such as doubt, desire, and fear are all propositional attitudes,
that is, relations between persons and propositions. An inference is much like
an argument, which is the verbal description of a set of sentential premises
that provide the basis for accepting a conclusion. Most philosophical and
computational accounts of abductive inference have assumed this kind of
linguistic picture of belief and inference.
There are many problems with this view. It postulates the existence of
an infinite number of propositions, including an infinite number that will
never be expressed by any uttered sentence. These are abstract entities whose
existence is utterly mysterious. Just as mysterious is the relation between
persons and propositions, for what is the connection between a person’s body
or brain and such abstract entities? The notion of a proposition dates back
at least to Renaissance times when almost everyone assumed that persons
were essentially non-corporeal souls, which could have some non-material
relation to abstract propositions. But the current ascendancy of investigation
of mental states and operations in terms of brain structures and processes
makes talk of abstract propositions as antiquated as theories about souls or
disease-causing humors. Moreover, philosophical theories of propositional
belief have generated large numbers of insoluble puzzles, such as how it can
be that a person can believe that Lewis Carroll wrote Alice in Wonderland,
but not that Charles Dodgson did, when the beliefs seem to have the same
content because Carroll and Dodgson are the same person.
Implicit in my account of abductive inference is a radically different ac-
count of belief that abandons the mysterious notion of a proposition in favor
of biologically realistic ideas about neural structures. In short, beliefs are
neural structures consisting of neurons, connections, and spiking behavior;
and so are all the other mental states that philosophers have characterized as
propositional attitudes, including doubt, desire, and fear. This view does away
with the metaphysical notion of a proposition, but does not eliminate stan-
dard mental concepts such as belief and desire, which are, however, radically
reconstrued in terms of structures and processes in the brain (cf. Churchland,
1989).
This view of mental operations makes possible an account of the nature of
inference that is dramatically different from the standard account that takes
inference to operate on propositions the same way that argument operates
on sentences. First, it allows for non-verbal representations from all sensory
modalities to be involved in inference. Second, it allows inferences to be
holistic in ways that arguments are not, in that they can simultaneously take
into account a large amount of information before producing a conclusion.
How this works computationally is shown by connectionist computational
models such as my ECHO model of explanatory coherence (Thagard 1992).
ECHO is not nearly as neurologically realistic as the current context requires,
since it uses localist artificial neurons very different from the groups of spiking
neurons that I have been discussing, but it at least shows how parallel activity
can lead to holistic conclusions.
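To give a concrete sense of this style of computation, here is a minimal Python sketch of a localist constraint-satisfaction network; it is our illustration rather than the ECHO code itself, and the unit labels, weights, and constants are arbitrary choices.

```python
# A minimal sketch of localist constraint satisfaction (not Thagard's ECHO code).
# Hypothesis and evidence units are linked by excitatory (explanatory) and
# inhibitory (competitive) weights; repeated parallel updates let the more
# coherent hypothesis win. All labels and constants here are illustrative.

units = ["H1", "H2", "E1", "E2", "E3"]
idx = {u: i for i, u in enumerate(units)}
n = len(units)
weights = [[0.0] * n for _ in range(n)]

def link(a, b, w):
    """Create a symmetric link between two units."""
    weights[idx[a]][idx[b]] = w
    weights[idx[b]][idx[a]] = w

link("H1", "E1", 0.04)   # H1 explains E1 and E2
link("H1", "E2", 0.04)
link("H2", "E1", 0.04)   # H2 explains only E1
link("H1", "H2", -0.06)  # rival hypotheses inhibit each other

activation = [0.01] * n
evidence = {"E1", "E2", "E3"}   # observed data receive constant external support

for _ in range(200):            # synchronous updates until the network settles
    updated = []
    for i, unit in enumerate(units):
        net = sum(weights[i][j] * activation[j] for j in range(n))
        if unit in evidence:
            net += 0.05
        a = activation[i]
        # push activation toward +1 for positive net input, toward -1 for negative
        delta = net * (1 - a) if net > 0 else net * (a + 1)
        updated.append(max(-1.0, min(1.0, 0.95 * a + delta)))
    activation = updated

print({u: round(activation[idx[u]], 2) for u in units})
# H1, which explains more of the evidence, settles at a higher activation than H2.
```

Because each unit is updated in parallel from all of its neighbors, the network's preference for one hypothesis over its rival reflects all of the evidential links at once, which is the holistic character just described.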
What then is inference? Most generally, inference is a kind of transforma-
tion of neural structures, but obviously not all such transformations count
as inference. We need to isolate a subclass of neural structures that are rep-
resentations, that is, ones that stand for something real or imagined in the
world. Roughly, a neural structure is a representation if its connections and
spiking behavior enable it to relate to perceptual input and/or the behavior
of other neural structures in such a way that it can be construed as standing
for something else such as a thing or concept. This is a bit vague, but it is
broad enough to cover both cases where neural structures stand for concrete
things in the world, for example, George W. Bush, and for general categories
that may or may not have any members, for example, unicorns. Note that
one neural structure may constitute many representations, because different
spiking behaviors may correspond to different things. This capability is a
feature of all distributed representations.
Accordingly, we can characterize inference as the transformation of represen-
tational neural structures. Inference involving sentences is a special case of such
transformation where the relevant neural structures correspond to sentences.
My broader account has the great advantage of allowing thinking that uses
visual and other sensory representations to count as inference as well. From
this perspective, abduction is the transformation of representational neural
structures that produces neural structures that provide causal explanations.
6.
I described earlier how abductive inference is initiated by emotional reactions
such as surprise and puzzlement, but other forms of inference also have
affective origins. The influence of emotions on decision making has often
been noted (Damasio, 1994; Mellers et al., 1999; Thagard, 2001, 2006). But
less attention has been paid to the fact that inferences about what to do
are usually initiated by either positive or negative emotions. Decisions are
sometimes prompted by negative emotions such as fear: if I am afraid that
something bad will happen, I may be spurred to decide what to do about it.
For example, a person who is worried about being fired may decide to look for
other jobs. It would be easy to generate examples of cases where other negative
emotions such as anger, sadness, envy, and guilt lead people to begin a process
of deliberation that leads to a practical inference. More positively, emotions
such as happiness can lead to decisions, as when someone thinks about how
much fun it would be to have a winter vacation and begins to collect travel
information that will produce a decision about where to go. Just as people do
not make abductive inferences unless there is some emotional reason for them
to look for explanations, people do not make inferences about what to do unless
negative or positive emotional reactions to their current situation indicate
that action is required. In some cases, the emotions may be tied to specific
perceptual states, such as when hunger initiates a decision about what to eat.
Deductive inference might be thought to be impervious to emotional influ-
ences, but there is neurological evidence that even deduction can be influenced
by brain areas such as the ventromedial prefrontal cortex that are known to
be involved in emotion (Houdé et al., 2001; Goel and Dolan, 2003). It is hard
to say whether deduction is initiated by emotion, because I think it is rarely
initiated at all outside the context of mathematical reasoning: readers should
ask themselves when was the last time they made a deductive inference. But
perhaps deduction is sometimes initiated by puzzlement, as when one won-
ders whether an object has a property and then retrieves from memory a rule
that says that all objects of this type have the property in question. This kind
of inference may be so automatic, however, that we never become aware of
the making of the inference or any emotional content of it.
Analogical inference often involves emotional content, especially when it
is used in persuasion to transfer negative affect from a source to a target
(Blanchette and Dunbar, 2001; Thagard and Shelley, 2001). For example,
comparing a political leader to Hitler is a common way of motivating people to
dislike the leader. Such persuasive analogies are often motivated by emotional
reactions such as dislike of a person or policy. Because I dislike the leader, I
compare him or her to Hitler in order to lead you to dislike the leader also.
Practical analogical inferences are prompted by emotions in the same way
that other decisions are: I want to go on vacation, and remember I had a
good time at a resort before, and decide to go to a similar resort. Analogical
abductions in which an explanatory hypothesis is formed by analogy to a
previous explanation are prompted by the same emotional reactions (surprise,
puzzlement) as other abductive inferences.
Are inductive generalizations initiated by emotional reactions? At the least,
emotion serves to focus on what is worth the mental effort to think about
enough to form a generalization. As a social example, if I have no interest in
Albanians, I will probably not bother to form a stereotype that generalizes
about them, whereas if I strongly like or dislike them I will be much more
inclined to generalize about their positive or negative features. I conjecture
that most inductive generalizations occur when there is some emotion-related
interest in the category about which a rule is formed.
It is unfortunate that no one has collected a corpus that records the kinds of
inferences that ordinary people make every day. I conjecture that such a corpus
would reveal that most people make a large number of practical inferences
when decisions are required, but a relatively small number of inductive and
deductive inferences. I predict that deduction is very rare unless people are
engaged in mathematical work, and that inductive inferences are not very
frequent either. A carefully collected corpus would display, I think, only the
occasional inductive generalization or analogical inference, and almost none
of the categorical inductions studied by many experimental psychologists.
Abductive inferences generating causal explanations of puzzling occurrences
would be more common, I conjecture, but not nearly as common as practical
inferences generating decisions. If the inference corpus also recorded the
situations that prompt the making of inferences, it would also provide the
basis for testing my claim that most inference, including practical, abductive,
inductive, analogical, and deductive, is initiated by emotions. For further
discussion of the relation between deduction and induction, see Chapters 10
and 11 in this volume by Rips and Asmuth and by Oaksford and Hahn.
7.
Some psychological research on inductive inference has pointed to the ten-
dency of people to assume the presence of underlying mechanisms associated
with categories of things in the world (Rehder and Burnett, 2005; Ahn, Kalish,
Medin, and Gelman, 1995). Psychologists have had little to say about what
mechanisms are, or how people use representations of mechanisms in their
Mechanism-based abduction differs from the simple sort in that people mak-
ing inferences can rely on a whole collection of causal relations among the
relevant objects, not just a particular causal relation.
So far, I have been discussing abduction from mechanisms, in which rep-
resentations of a mechanism are used to suggest an explanatory hypothesis
about what is happening to the objects in it. But abductive inference is even
more important for generating knowledge about how the mechanism works,
especially in cases where its operation is not fully observable. With a bicycle,
I can look at the pedals and figure out how they move the chains and wheels,
but much of scientific theorizing consists of generating new ideas about unob-
servable mechanisms. For example, medical researchers develop mechanistic
models of how the metabolic system works in order to explain the origins
of diseases such as diabetes. Often, such theorizing requires postulation of
objects, relations, and changes that are not directly observed. In such cases,
knowledge about mechanisms cannot be obtained by inductive generalization
of the sort that works with bicycles, but depends on abductive inference in
which causal patterns are hypothesized rather than observed. This kind of
abduction to mechanisms is obviously much more difficult and creative than
abduction from already understood mechanisms. Often it involves analogical
inference in which a mechanism is constructed by comparing the target to be
explained to another similar target for which a mechanism is already under-
stood. In this case, a mechanism <objects, relations, changes> is constructed
by mapping from a similar one. For more on analogical discovery, see Holyoak
and Thagard (1995, ch. 8).
In order to show in more detail how abductive inference can be both to and
from mechanisms, it would be desirable to apply the neurocomputational
model of abductive inference developed by Thagard and Litt (forthcoming).
That model has the representational resources to encode complex objects and
relations, but it has not yet been applied to temporal phenomena involving
change. Hence neural modeling of inferences about mechanisms is a problem
for future research.
8.
In sum, abduction is multimodal in that it can operate on a full range of per-
ceptual as well as verbal representations. It also involves emotional reactions,
both as input to mark a target as worthy of explanation and as output to signal
satisfaction with an inferred hypothesis. Representations are neural structures
consisting of neurons, neuronal connections, and spiking behaviors. In ab-
duction, the relation between hypotheses and targets is causal explanation,
where causality is rooted in perceptual experience. Inference is transforma-
tion of representational neural structures. Such structures are mechanisms,
and abductive inference sometimes applies knowledge of mechanisms and
more rarely and valuably generates new hypotheses about mechanisms.
Much remains to be done to flesh out this account. Particularly needed
is a concrete model of how abduction could be performed in a system of
spiking neurons of the sort investigated by Eliasmith and Anderson (2003)
and Wagar and Thagard (2004). The former reference contains valuable ideas
about neural representation and transformation, while the latter is useful
for ideas about how cognition and emotion can interact. Thagard and Litt
(forthcoming) combine these ideas to provide a fuller account of the neural
mechanisms that enable people to perform abductive inference. Moving the
study of abduction from the domain of philosophical analysis to the realm of
neurological mechanisms has made it possible to combine logical aspects of
abductive inference with multimodal aspects of representation and emotional
aspects of cognitive processing. We can look forward to further abductions
about abduction.
I am grateful to Jennifer Asmuth, Aidan Feeney, Abninder Litt, and Dou-
glas Medin for helpful comments on an earlier draft. Funding for research
was provided by the Natural Sciences and Engineering Research Council of
Canada.
References
Ahn, W., Kalish, C. W., Medin, D. L., & Gelman, S. (1995). The role of covariation versus
mechanism information in causal attribution. Cognition, 54, 299–352.
Baillargeon, R., Kotovsky, L., & Needham, A. (1995). The acquisition of physical knowl-
edge in infancy. In D. Sperber, D. Premack, & A. J. Premack (Eds.), Causal cognition:
A multidisciplinary debate (pp. 79–116). Oxford: Clarendon Press.
Bechtel, W., & Abrahamsen, A. A. (2005). Explanation: A mechanistic alternative. Studies
in History and Philosophy of Biology and Biomedical Sciences, 36, 421–441.
Blake, R. M., Ducasse, C. J., & Madden, E. H. (1960). Theories of scientific method: The
renaissance through the nineteenth century. Seattle: University of Washington Press.
Blanchette, I., & Dunbar, K. (2001). Analogy use in naturalistic settings: The influence
of audience, emotion, and goals. Memory & Cognition, 29, 730–735.
Charniak, E., & McDermott, D. (1985). Introduction to artificial intelligence. Reading,
MA: Addison-Wesley.
Churchland, P. M. (1989). A neurocomputational perspective. Cambridge, MA: MIT Press.
Damasio, A. R. (1994). Descartes’ error. New York: G. P. Putnam’s Sons.
Darden, L. (1991). Theory change in science: Strategies from Mendelian genetics. Oxford:
Oxford University Press.
Dunbar, K. (1997). How scientists think: On-line creativity and conceptual change in
science. In T. B. Ward, S. M. Smith, & J. Vaid (Eds.), Creative thought: An investi-
gation of conceptual structures and processes (pp. 461–493). Washington: American
Psychological Association.
Eliasmith, C. (2005). Neurosemantics and categories. In H. Cohen & C. Lefebvre
(Eds.), Handbook of categorization in cognitive science (pp. 1035–1054). Amsterdam:
Elsevier.
Eliasmith, C., & Anderson, C. H. (2003). Neural engineering: Computation, representation
and dynamics in neurobiological systems. Cambridge, MA: MIT Press.
Feldman, J., & Narayan, S. (2004). Embodied meaning in a neural theory of language.
Brain and Language, 89, 385–392.
Flach, P. A., & Kakas, A. C. (Eds.). (2000). Abduction and induction: Essays on their
relation and integration. Dordrecht: Kluwer.
Fugelsang, J. A., Roser, M. E., Corballis, P. M., Gazzaniga, M. S., & Dunbar, K. N.
(2005). Brain mechanisms underlying perceptual causality. Cognitive Brain Research,
24, 41–47.
Gilovich, T., Griffin, D., & Kahneman, D. (Eds.). (2002). Heuristics and biases: The
psychology of intuitive judgment. Cambridge: Cambridge University Press.
Goel, V., & Dolan, R. J. (2003). Reciprocal neural response within lateral and ven-
tral medial prefrontal cortex during hot and cold reasoning. NeuroImage, 20, 2314–
2321.
Gopnik, A. (1998). Explanation as orgasm. Minds and Machines, 8, 101–118.
Hanson, N. R. (1958). Patterns of discovery. Cambridge: Cambridge University Press.
Harman, G. (1973). Thought. Princeton: Princeton University Press.
Holyoak, K. J., & Thagard, P. (1995). Mental leaps: Analogy in creative thought. Cam-
bridge, MA: MIT Press/ Bradford Books.
Houdé, O., Zago, L., Crivello, F., Moutier, F., Pineau, S., Mazoyer, B., et al. (2001).
Access to deductive logic depends on a right ventromedial prefrontal area devoted
to emotion and feeling: Evidence from a training paradigm. NeuroImage, 14, 1486–
1492.
Josephson, J. R., & Josephson, S. G. (Eds.). (1994). Abductive inference: Computation,
philosophy, technology. Cambridge: Cambridge University Press.
Klahr, D. (2000). Exploring science: The cognition and development of discovery processes.
Cambridge, MA: MIT Press.
Kunda, Z. (1999). Social cognition. Cambridge, MA: MIT Press.
Kunda, Z., Miller, D., & Claire, T. (1990). Combining social concepts: The role of causal
reasoning. Cognitive Science, 14, 551–577.
Lakoff, G. (1987). Women, fire, and dangerous things. Chicago: University of Chicago
Press.
Langley, P., Shrager, J., Asgharbeygi, N., Bay, S., & Pohorille, A. (2004). Inducing explana-
tory process models from biological time series. In Proceedings of the ninth workshop
on intelligent data analysis and data mining. Stanford, CA.
Langley, P., Simon, H., Bradshaw, G., & Zytkow, J. (1987). Scientific discovery. Cambridge,
MA: MIT Press/Bradford Books.
Leslie, A. M., & Keeble, S. (1987). Do six-month-old infants perceive causality? Cognition,
25, 265–288.
Lien, Y., & Cheng, P. W. (2000). Distinguishing genuine from spurious causes: A coher-
ence hypothesis. Cognitive Psychology, 40, 87–137.
Lipton, P. (2004). Inference to the best explanation (2nd ed.). London: Routledge.
Maass, W., & Bishop, C. M. (Eds.). (1999). Pulsed neural networks. Cambridge, MA: MIT
Press.
Machamer, P., Darden, L., & Craver, C. F. (2000). Thinking about mechanisms. Philoso-
phy of Science, 67, 1–25.
Magnani, L. (2001). Abduction, reason, and science: Processes of discovery and explanation.
New York: Kluwer/Plenum.
Mandler, J. M. (2004). The foundations of mind: Origins of conceptual thought. Oxford:
Oxford University Press.
Mellers, B., Schwartz, A., & Ritov, I. (1999). Emotion-based choice. Journal of Experi-
mental Psychology: General, 128, 332–345.
Michotte, A. (1963). The perception of causality (T. R. Miles & E. Miles, Trans.). London:
Methuen.
Nisbett, R. E., & Ross, L. (1980). Human inference: Strategies and shortcomings of social
judgement. Englewood Cliffs, NJ: Prentice Hall.
Pearl, J. (2000). Causality: Models, reasoning, and inference. Cambridge: Cambridge
University Press.
Peirce, C. S. (1931–1958). Collected papers. Cambridge, MA: Harvard University Press.
Psillos, S. (1999). Scientific realism: How science tracks the truth. London: Routledge.
Read, S., & Marcus-Newhall, A. (1993). The role of explanatory coherence in the con-
struction of social explanations. Journal of Personality and Social Psychology, 65, 429–
447.
Rehder, B., & Burnett, R. C. (2005). Feature inference and the causal structure of
categories. Cognitive Psychology, 50, 264–314.
Rieke, F., Warland, D., de Ruyter van Steveninck, R. R., & Bialek, W. (1997). Spikes:
Exploring the neural code. Cambridge, MA: MIT Press.
Shelley, C. P. (1996). Visual abductive reasoning in archaeology. Philosophy of Science,
63, 278–301.
Sloman, S. A., & Lagnado, D. (2005). The problem of induction. In R. Morrison &
K. J. Holyoak (Eds.), Cambridge handbook of thinking and reasoning (pp. 95–116).
Cambridge: Cambridge University Press.
Smolensky, P. (1990). Tensor product variable binding and the representation of symbolic
structures in connectionist systems. Artificial Intelligence, 46, 159–217.
Thagard, P. (1988). Computational philosophy of science. Cambridge, MA: MIT
Press/Bradford Books.
Thagard, P. (1992). Conceptual revolutions. Princeton: Princeton University Press.
Thagard, P. (1999). How scientists explain disease. Princeton: Princeton University Press.
Thagard, P. (2000). Coherence in thought and action. Cambridge, MA: MIT Press.
Thagard, P. (2003). Pathways to biomedical discovery. Philosophy of Science, 70, 235–254.
Thagard, P. (2005). Mind: Introduction to cognitive science (2nd ed.). Cambridge, MA:
MIT Press.
Thagard, P. (2006). Hot thought: Mechanisms and applications of emotional cognition.
Cambridge, MA: MIT Press.
Thagard, P., & Litt, A. (forthcoming). Models of scientific explanation. In R. Sun (Ed.),
The Cambridge handbook of computational cognitive modeling. Cambridge: Cambridge
University Press.
Thagard, P., & Shelley, C. P. (1997). Abductive reasoning: Logic, visual thinking, and
coherence. In M. L. Dalla Chiara, K. Doets, D. Mundici, & J. van Benthem (Eds.),
Logic and scientific methods (pp. 413–427). Dordrecht: Kluwer.
Thagard, P., & Shelley, C. P. (2001). Emotional analogies and analogical inference. In
D. Gentner, K. H. Holyoak, & B. K. Kokinov (Eds.), The analogical mind: Perspectives
from cognitive science (pp. 335–362). Cambridge, MA: MIT Press.
van Fraassen, B. (1980). The scientific image. Oxford: Clarendon Press.
Wagar, B. M., & Thagard, P. (2004). Spiking Phineas Gage: A neurocomputational theory
of cognitive-affective integration in decision making. Psychological Review, 111, 67–79.
10
However much we may disparage deduction, it cannot be denied that the laws estab-
lished by induction are not enough.
Frege (1884/1974, p. 23)
The last expression is also of the form ½n(n + 1). So this sum formula
necessarily holds for all natural numbers.
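(The inductive step behind that conclusion, reconstructed here as a standard sketch rather than a quotation of the original derivation, runs as follows.)

```latex
% Inductive step for the sum formula 1 + 2 + ... + n = (1/2) n (n+1):
% assume the formula for k and derive it for k + 1.
\begin{align*}
1 + 2 + \cdots + k + (k+1)
  &= \tfrac{1}{2}k(k+1) + (k+1)        && \text{(induction hypothesis)}\\
  &= (k+1)\bigl(\tfrac{1}{2}k + 1\bigr)                                  \\
  &= \tfrac{1}{2}(k+1)(k+2),           && \text{i.e., } \tfrac{1}{2}n(n+1) \text{ with } n = k+1 .
\end{align*}
```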
By contrast, empirical induction leads to conclusions that are not nec-
essarily true but only probably true, even when these conclusions involve
mathematical relationships. For example, we might hope to establish induc-
tively that people forget memories at a rate equal to y = b − m ln(t), where
t is time since the remembered event and m and b are constants. To support
such a conclusion, we might examine forgetting rates for many people and
many event types and show that forgetting follows the proposed function in a
convincing range of cases (Rubin & Wenzel, 1996). Even if the function fit the
data perfectly, though, no one would suppose that the evidence established
the conclusion as necessarily true. It’s not difficult to imagine changes in our
mental makeup or changes in the nature of our everyday experience that
would cause us to forget events according to some other functional form. The
best we can hope for is a generalization that holds for people who are similar
to us in causally important ways (see Heit’s Chapter 1 in this volume for an
overview of empirical induction).
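To make the role of the proposed function concrete, here is a minimal Python sketch of how one might fit it to retention data; the numbers below are illustrative values, not data from Rubin and Wenzel (1996).

```python
# A minimal sketch of fitting the logarithmic forgetting function y = b - m*ln(t).
# The retention values below are illustrative, not real data.
import numpy as np

t = np.array([1, 2, 4, 8, 16, 32], dtype=float)     # time since the remembered event
y = np.array([0.90, 0.82, 0.75, 0.66, 0.59, 0.51])  # proportion remembered (made up)

# y = b - m*ln(t) is linear in ln(t), so an ordinary least-squares line suffices.
slope, intercept = np.polyfit(np.log(t), y, deg=1)
m, b = -slope, intercept
print(f"fitted m = {m:.3f}, b = {b:.3f}")

# However well the line fits, the support it gives the generalization is only
# inductive: nothing rules out populations or conditions with a different form.
```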
We were able to convince our colleague that this line of thinking was correct.
Or perhaps he was just too polite to press the point. In fact, we still believe
that this standard answer isn’t too far off the mark. But we also think that
we dismissed our colleague’s objection too quickly. One worry might be that
the theoretical distinction between math induction and empirical induction
is not as clear as we claimed. And, second, even if the theoretical difference
were secure, it wouldn’t follow that the psychological counterparts of these
operations are distinct. In this chapter we try to give a better answer to the
objection by examining ways that induction could play a role in mathematics.
In doing so, we won’t be presenting further empirical evidence for a dissocia-
tion between inductive and deductive reasoning. We will be assuming that the
previous evidence we’ve cited is enough to show a difference between them
in clear cases. Our hope is to demonstrate that potentially unclear cases –
ones in which people reach generalizations – don’t compromise the distinc-
tion (see Oaksford & Hahn, Chapter 11 in this volume, for a more skeptical
view of the deduction/induction difference).
We consider first the relationship between math induction and empirical
induction to see how deep the similarity goes. Although there are points of
resemblance, we argue that there are crucial psychological differences between
them. This isn’t the end of the story, however. There are other methods of
reaching general conclusions in math besides math induction, and it is possible
that one of them provides the sort of counterexample that our colleague was
seeking. One such method involves reasoning from an arbitrary case – what is
called universal generalization in logic. Several experiments suggest that people
sometimes apply this method incorrectly, and we examine these mistakes for
evidence of a continuum between inductive and deductive methods. We
claim, however, that although the mistakes show that people use inductive
procedures when they should be sticking to deductive ones, they provide no
evidence against a qualitative deduction/induction split.
Of course, we won’t be claiming that there’s no role for induction in math-
ematics. Both students and professional mathematicians use heuristics to
discover conjectures and methods that may be helpful in solving problems
(see Polya, 1954, for a classic statement on the uses of induction in mathemat-
ics). Our claim is simply that these inductive methods can be distinguished
from deductive proof, a point on which Polya himself agreed: “There are two
kinds of reasoning, as we said: demonstrative reasoning and plausible reason-
ing. Let me observe that they do not contradict each other; on the contrary
they complete each other” (Polya, 1954, p. vi).
2 Use of math induction as a proof technique was, of course, well established before the intro-
duction of (IAx) as a formal axiom. Medieval Arabic and Hebrew sources use implicit versions
of math induction to prove theorems for finite series and for combinations and permutations
(Grattan-Guinness, 1997). According to Kline (1972, p. 272), “the method was recognized ex-
plicitly by Maurolycus in his Arithmetica of 1575 and was used by him to prove, for example,
that 1 + 3 + 5 + · · · + (2n − 1) = n².”
And so on. In general, this modus ponens strategy requires k modus ponenses
to conclude that k satisfies the sum formula, and it will be impossible to
reach the conclusion for all natural numbers in a finite number of steps.
(IAx) sidesteps this worry by reaching the conclusion for all n in one step.
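Schematically, writing P(n) for “n satisfies the sum formula” (a property version of the set S used in the text), the two routes can be contrasted as follows:

```latex
% The step-by-step modus ponens route versus the one-step induction axiom.
\begin{align*}
\text{Modus ponens strategy:}\quad
  & P(0),\; P(0)\rightarrow P(1)\ \therefore\ P(1);\qquad
    P(1),\; P(1)\rightarrow P(2)\ \therefore\ P(2);\ \ldots\\
  & \text{(one application per number, never all the numbers at once)}\\[4pt]
\text{Induction axiom (IAx):}\quad
  & \bigl[P(0)\ \wedge\ \forall k\,\bigl(P(k)\rightarrow P(k+1)\bigr)\bigr]
    \ \rightarrow\ \forall n\,P(n)
    \qquad \text{(all natural numbers in one step).}
\end{align*}
```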
However, putting the “and so on . . . ” idea “squarely in the actual definition
of the natural numbers,” as Stewart and Tall suggest, isn’t a very comforting
solution if this means hiding the “and so on . . . ” pea under another shell.
What justifies such an axiom?
There are several ways to proceed. First, we could try to eliminate the “and
so on . . . ” doubt by showing that (IAx) is inherently part of people’s thinking
about the natural numbers and, in this sense, is “squarely in” their definition.
According to this idea, any kind of intuitive understanding of these numbers
already allows you to conclude in one go that if 0 is in S and if k + 1 is in S
whenever k is, then all natural numbers are in S. This is close to Poincaré’s
(1902/1982) view. Although he acknowledged that there is “a striking analogy
with the usual procedures of [empirical] induction,” he believed that math-
ematical induction “is the veritable type of the synthetic a priori judgment.”
By contrast with empirical induction, mathematical induction “imposes itself
necessarily, because it is only the affirmation of a property of the mind itself ”
(pp. 39–40). Thus, any attempt to found math induction on a formal axiom
like (IAx) can only replace an idea that is already intuitively clear with one
that is more obscure. Goldfarb (1988) points out that the psychological na-
ture of math induction was irrelevant to the goals of those like Frege, Hilbert,
and Russell, who were the targets of Poincaré’s critique (see Hallett, 1990, for
debate about these goals). But perhaps Poincaré’s theory is sufficient to settle
specifically psychological doubts. Another possibility, one more consistent
with Frege’s (1884/1974) and Russell’s (1919) viewpoint, is that (IAx) is the
result of a principle that applies in a maximally domain-general way and is
thus a precondition of rationality. We’ll look more closely at the source of this
generality in just a moment in examining Frege’s treatment of (IAx).
We favor a view in which math induction is inherent in the concept of the
natural numbers – whether this is specific to the natural-number concept or
is inherited from more general principles – and this view preserves a crisp
distinction between deductive and inductive reasoning. But can we really
rule out the possibility that the “and so on . . .” doubt is a symptom of a true
intermediate case? Maybe (IAx) merely plasters over a gap like that in ordinary
induction – the “and so on . . .” gap that is left over from the modus ponens
strategy.
One thing that seems clear is that in actually applying math induction –
for example, in our use of it to prove the sum formula – the modus ponens
where N₀ is the set of natural numbers and s(·) is the successor function, as
before.
From this definition, mathematical induction follows, just as Stewart and
Tall (1977) claimed. For suppose S is a set that fulfills the two conditions on
(IAx), as in (2):
topic, Smith (2002, p. 6) cites a definition similar to (IAx) and then goes
on to remark, “The definition secures a generalization (universalization).
The reference in the premises is to a property true of any (some particular)
number. But the reference in the conclusion is to any number whatsoever,
any number at all. Note that the term any is ambiguous here since it covers
an inference from some-to-all.” The word any does not, in fact, appear in the
definition that Smith cites. The relevant clause is “ . . . and if it is established
that [the property] is true of n + 1 provided it is true of n, it will be true
of all whole numbers.” But what Smith seems to mean is that the proof is a
demonstration for some particular case, k and k + 1, whereas the conclusion
is that the property is true for all natural numbers n. Similarly, in discussing
earlier work by Inhelder and Piaget, Smith states:
The inference that Smith is pointing to, however, is not a feature of math
induction per se but of the universal generalization that feeds it. Universal
generalization, roughly speaking, is the principle that if a predicate can be
proved true for an arbitrary member of a domain, it must be true for all
members. A formal statement of this rule can be found in most textbooks
on predicate logic. We have adapted slightly the following definition from
Lemmon (1965):
(UG) Let P(e) be a well-formed formula containing the arbitrary name e, and
x be a variable not occurring in P(e); let P(x) be the propositional function in
x which results from replacing all and only occurrences of e in P(e) by x. Then,
given P(e), UG permits us to draw the conclusion (∀x)P(x), provided that e
occurs in no assumption on which P(e) rests.
For example, when we proved that the sum formula holds for k + 1 if it
holds for k, we used no special properties of k except for those common to
all natural numbers. In this sense, k functioned as an arbitrary name. Hence
(by universal generalization), for all natural numbers n, the sum formula
holds of n + 1 if it holds for n. This generalization is needed to satisfy the
second condition of (IAx), but it is also needed to supply the major premises
for the modus ponens strategy and, of course, for much other mathematical
reasoning. The confusion between (IAx) and (UG) is an understandable one.
We could rephrase (IAx) in a way that refers to an arbitrary case: if S is a
An example of this sort may increase your confidence that the sum formula
is true, but it doesn’t produce a deductively valid proof. However, there’s an
intuitive kinship between this pseudo-proof and the earlier correct proof of the
sum formula. The phrase arbitrary number captures this kinship, since it can
mean either a specific number that we select in some haphazard way or it can
mean a placeholder that could potentially take any number as a value. Sliding
between these two meanings yields an inductive, example-based argument,
on one hand, and a deductively valid (UG)-based proof, on the other. For the
sum formula, it’s the difference between the argument from seven that we just
gave and the correct argument with k.
(UG) is pervasive in mathematics, in part because most algebraic ma-
nipulations depend on it. Variables like k turn up everywhere, not just in
the context of mathematical induction. Thus, confusion about the nature of
variables is apt to have widespread consequences and blur the distinction
In Smith’s (2002) study of children’s arithmetic, the key question to partici-
pants was: “If you add any number [of toys] at all to one pot and the same
number to the other, would there be the same in each, or would there be
more in one than in the other?” A correct answer to this question seems to
depend on the children realizing that for all numbers x and quantities m
and n, x + m = x + n if and only if m = n. In the experiment, m and n are
the initial quantities of toys in the pots (m = n = 0 in one condition of the
study, and m = 1 and n = 0 in another). It is certainly possible to prove this
principle using (IAx), but it is far from clear that this is the way children were
approaching the problem. Reasoning along the lines of (UG) would mean
going from “for arbitrary k, (m + k = n + k) if and only if (m = n)” to the
above conclusion for all x. But again it is unclear that this is how children
went about answering this question. Among the children’s justifications that
Smith quotes on this problem, the only one that directly suggests (UG) is “I’m
putting like a number in the orange pot and a number in the green pot, and
it’s got to be the same.”
There is plenty of reason, however, to suspect that children in this experi-
ment were using specific examples to justify their decisions. Some simply cited
a specific number, for example, “If you added a million in there, there would
be a million, and if you added a million in there, there would be a million,”
or “it would be just a million and six in there and a million and six in there.”
Others gave a more complete justification but also based on a specific number
of items: “If we started putting 0 in there and 1 in there, and then adding 5
and 5, and so it wouldn’t be the same – this would be 5 and 6.” Instead of the
(UG) strategy of thinking about an arbitrary addition, these children seem
to generalize directly from a particular number, such as 5 or 1,000,006, to
all numbers. The numbers may be selected arbitrarily but they aren’t capable
of denoting any natural number in the way (UG) requires. It is possible, of
course, that these children were using a correct general strategy and simply
citing specific cases to aid in their explanations, but the experiments we are
about to review suggest that even college students evaluate “proof by specific
examples” as a valid method.
Algebra
Asking college students to prove simple propositions in number theory often
results in “proof by multiple examples,” such as the following one (Eliaser,
2000):
Show that every odd multiple of 5 when squared is not divisible by 2.
Proof:
Odd multiples of 5: 5, 15, 25 . . .
5² = 25 → not divisible by 2.
15² = 225 → not divisible by 2.
Therefore shown.
In the experiment from which we took this excerpt, Northwestern University
undergraduates had to prove a group of simple statements, half of which had
a universal form, like the one above, requiring an algebraic proof. Although
nearly all these students were able to use algebra (as they demonstrated in a
post-test), only 41% initially employed it in proving universal propositions.
Instead, 33% used multiple examples (as in the sample “proof ”) and 17%
used single examples.
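For contrast with the example-based “proof” above, the kind of algebraic argument the problem calls for can be sketched as follows; this is our reconstruction, not the wording of Eliaser’s materials. An odd multiple of 5 can be written as 5(2k + 1) for some integer k, and then

```latex
\[
\bigl(5(2k+1)\bigr)^{2} \;=\; 25\,(4k^{2} + 4k + 1) \;=\; 2\,(50k^{2} + 50k + 12) + 1 ,
\]
```

which has the form 2x + 1 and so is not divisible by 2, whatever k is.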
Of course, students may not always see the right proof strategy immediately
and may fall back on examples instead. For instance, the students may not
remember or may not know how to set up a problem that requires represent-
ing the parity of a number (i.e., that all even numbers have the form 2x and
all odd ones 2x + 1). But this doesn’t seem to be the whole story behind the
students’ dismal performance. In a second study, Eliaser (2000) gave potential
proofs to a new group of undergraduates and asked them to rate the quality
of the proof on a four-point scale from “shows no support” to “completely
proves.” The “proofs” included single-example, multiple-example, and alge-
braic versions. But despite the fact that students didn’t have to produce their
own proofs and only had to evaluate the proofs of others, they nevertheless
rated multiple-example pseudo-proofs higher than either single-example or
(correct) algebraic proofs, for universal claims, such as the one about odd
multiples of 5.
Depressing performance of a similar kind comes from a study of students
who were training to be elementary math teachers (Martin & Harel, 1989).
These students rated proofs and pseudo-proofs for two number-theoretic
statements (e.g., If a divides b and b divides c, then a divides c). The pseudo-
proofs included a typical example (e.g., “12 divides 36, 36 divides 360, 12
divides 360”) and an example with a big, random-looking number:
Let’s pick any three numbers, taking care that the first divides the second, and the
second divides the third: 49 divides 98, and 98 divides 1176. Does 49 divide 1176?
(Computation shown to left.) The answer is yes.
Take 4, 8, and 24. 4 divides 8, which means that there must exist a number, in
this case 2, such that 2 × 4 = 8. 8 divides 24, which means there must exist a
number, in this case 3, such that 3 × 8 = 24. Now substitute for 8 in the previous
equation, and we get 3 × (2 × 4) = 24. So we found a number (3 × 2), such that
(3 × 2) × 4 = 24. Therefore, 4 divides 24.
Students rated these pseudo-proofs, along with other pseudo- and correct
proofs on a scale from 1 (not considered a proof) to 4 (a mathematical proof).
If we take a rating of 4 to be an endorsement of the argument as a proof, then
28% endorsed the correct proof, 22% the instantiated pseudo-proof, 39% the
specific example, and 38% the big random-looking example.
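For reference, the correct general argument follows the same pattern as the 4–8–24 passage but with variables in place of particular numbers (a sketch of the standard proof, not necessarily the exact version shown to participants): if a divides b, then b = ma for some integer m, and if b divides c, then c = nb for some integer n, so

```latex
\[
c \;=\; nb \;=\; n(ma) \;=\; (nm)\,a ,
\]
```

and hence a divides c.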
Geometry
Students’ use of example-based proof strategies isn’t limited to algebra.
It’s a common classroom warning in high school geometry that students
shouldn’t draw general conclusions from the specific cases that they see in the
diagrams accompanying a problem. Diagrams represent one way in which
the givens of a problem could be true, not all possible ways. Despite these
warnings, students do generate answers to test questions based on single
diagrams.
Evidence on this point comes from a study by Koedinger and Anderson
(1991). This study presented students with yes/no questions about geometric
relationships, accompanied by one of two diagrams. For example, one prob-
lem asked, “If ∠BCD = ∠ACD, must AB ⊥ CD?” The two possible diagrams
for this problem appear as Figures 10.1a and 10.1b. For half the problems
the correct answer was “yes,” as in this example. The remaining problems
had “no” as the correct answer, for instance, “If AC = CB, must AB⊥CD?”
The two possible diagrams for this problem are Figures 10.1c and 10.1d. For
each of these pairs of problems, the given information and the conclusion
were true of one diagram (Figures 10.1a and 10.1c) and were false of the
other (Figures 10.1b and 10.1d). Individual participants saw only one of the
possible diagrams per problem.
The results showed that students were usually able to get the right answer:
they were correct on about 68% of trials. For problems where the correct
answer was “yes” (so that a proof was possible), the students performed more
accurately when the corresponding proof was easy than when it was difficult,
suggesting that they were attempting such a proof. Nevertheless, they tended to
be swayed by the diagrams: they made more “yes” responses in all conditions
when the diagram pictured the given information and the conclusion (as
in Figures 10.1a and 10.1c) than when it didn’t (Figures 10.1b and 10.1d).
Koedinger and Anderson take this pattern of results to mean that the students
were attempting to find a proof but were using the diagram as a backup
inductive strategy if no proof was forthcoming (a “modified misinterpreted
necessity model” in the jargon of reasoning research; see Evans, Newstead, &
Byrne, 1993).
Further evidence that students rely on diagrams for inductive support
comes from one of our own studies of non-Euclidean geometry. In this
experiment, we gave participants a short introduction to hyperbolic geometry,
which differs from Euclidean geometry by the following hyperbolic axiom: for
any given line ℓ and point p not on ℓ, there are multiple lines through p
parallel to ℓ. The participants then solved a number of problems in the new
geometry while thinking aloud. Figure 10.2 illustrates one such problem. To
avoid confusing participants about the meaning of parallel, we substituted the
nonsense term cordian, which we defined according to the hyperbolic axiom.
figure 10.1. Stimulus problems from a study of geometry problem solving (adapted
from Koedinger and Anderson, 1991, Figure 2). [Panels a and b pair the question
“If ∠BCD = ∠ACD, must AB ⊥ CD?” with two different diagrams of points A, B, C,
and D; panels c and d do the same for “If AC = CB, must AB ⊥ CD?”]

For the problem in Figure 10.2, we told participants that lines ℓ and m are
cordian, that ∠BAC and ∠BDC were 90° angles, and asked them to decide
whether triangle ABC was congruent to triangle DCB.3
3 If the two triangles ABC and DCB are congruent, corresponding angles of these triangles must
also be congruent. In Figure 10.2, ∠BAC and ∠CDB are given as congruent, but what about the
pair ∠DCB and ∠ABC and the pair ∠BCA and ∠CBD? In Euclidean geometry, the congruence
of each of these pairs follows from the fact that alternate interior angles formed by a transversal
and two parallel lines are congruent. In hyperbolic geometry, however, the alternate interior
angles formed by a transversal and two parallel (“cordian”) lines are not congruent. For example,
in Figure 10.2, ∠ABC and ∠DCB are both the alternate interior angles formed by the cut of
transversal BC across parallel lines ℓ and m. But because these corresponding angles cannot be
proved congruent, neither can the two triangles.
figure 10.2. Stimulus problem for a study of non-Euclidean geometry. [Diagram of
lines ℓ and m and points A, B, C, and D not reproduced.]
Implications
These examples show that people sometimes use inductive strategies where de-
ductive methods are necessary. More surprisingly, people sometimes evaluate
the inductive methods more favorably than the correct deductive ones. We
can explain the first of these findings on the assumption that people resort to
inductive methods when they run out of deductive ones – either because no
proof is possible or because they can’t find one – as Koedinger and Anderson
(1991) propose. But the evaluation results make it likely that this isn’t the
whole story. Why do people find inductive pseudo-proofs better than true
proofs when they’re able to assess them directly?
The answer may lie in the relative complexity of the true proofs versus the
inductive lures. Examples are likely to be easier to understand than proofs,
especially if the proofs are lengthy. Similarly, there’s a difference in concrete-
ness between proofs with individual numbers and those with arbitrary names
(i.e., mathematical variables) in algebra. Geometry diagrams also have a con-
creteness advantage over the proofs that refer to them. Even among correct
deductive proofs, we evaluate the simpler, more elegant ones as superior to
their more complex competitors, and it is possible that students’ incorrect
evaluations are a more extreme version of the same tendency.
Explanations along these lines do no harm to the thesis that deductive and
inductive reasoning make use of qualitatively different mechanisms. These
explanations take the induction/deduction difference for granted and account
for students’ errors on the basis of factors, such as complexity or abstractness,
that swamp deductive methods or make them seem less attractive. We know
of no experimental results that can eliminate these explanations. But it’s
also interesting to consider the possibility of a more continuous gradient
between inductive and deductive strategies. One proposal along these lines
might be based on the idea that people search for increasingly nonobvious
counterexamples to an argument or theorem. If they are unable to find any
counterexamples, either because there are none or because the individuals
can no longer hold enough examples in working memory, they declare the
argument valid. In the case of algebra, we can imagine students first choosing
a specific number that the problem suggests and that meets the theorem’s
givens. For example, in the problem mentioned earlier of showing that the
squares of odd multiples of 5 are not divisible by 2, the participant began
by verifying that 5² is not divisible by 2. If the first instance does not yield
a counterexample, the students try another (e.g., (3 × 5)²) until they either
discover a counterexample or run out of working-memory resources. In the
case of geometry, students might first verify that the conclusion is true of
the diagram that appears with the problem. They may then consider new
diagrams of their own in the same search for counterexamples.
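As a rough illustration of the strategy just described, the following sketch implements a bounded counterexample search; the budget parameter is a stand-in for limited working memory, and the function and variable names are illustrative rather than taken from the chapter.

def search_for_counterexample(claim, instances, budget=5):
    # Try successive instances until one falsifies the claim or the
    # "working-memory" budget (an illustrative stand-in) is exhausted.
    for instance in instances[:budget]:
        if not claim(instance):
            return instance      # counterexample found: the claim is refuted
    return None                  # none found: the claim is (inductively) accepted

# The algebra example from the text: squares of odd multiples of 5 are not divisible by 2.
claim = lambda m: (m * 5) ** 2 % 2 != 0
odd_multipliers = [1, 3, 5, 7, 9]    # checks 5², (3 × 5)², (5 × 5)², ...

print(search_for_counterexample(claim, odd_multipliers))   # -> None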
In the present context, one difficulty with this line of thinking, however,
is that it doesn’t account for Koedinger and Anderson’s (1991) evidence that
students also attempt to apply a separate deductive strategy and resort to
diagrams only when the first strategy fails. The counterexample hypothesis
predicts instead that students should begin by considering the given diagram,
look for other diagrams that might falsify the conclusion, and end by accepting
the theorem as valid if it correctly applies to all the diagrams they have
examined. In addition, there is a theoretical problem. In the case of a universal
proposition – such as the sum formula that applies to all natural numbers –
there’s no endpoint at which you can stop and be sure that no counterexamples
will be found. There will always be additional numbers that have features
other than the ones you’ve so far considered. Rather than provide a continuum
between inductive and deductive reasoning, this strategy never provides more
than weak inductive support. Of course, the fact that a pure counterexample
strategy is inadequate as an explanation of the data doesn’t mean that there
aren’t other methods intermediate between induction and deduction that
may be more reasonable. But we know of no convincing cases of this type.
Another way to put this point is to say that there is no number or diagram
that is “completely arbitrary” in a way that could prove a universal proposition.
“Each number has its own peculiarities. To what extent a given particular
number can represent all others, and at what point its own special character
comes into play, cannot be laid down generally in advance” (Frege, 1884/1974,
p. 20). The sort of arbitrariness that does prove such a proposition – and that
(UG) demands – is a matter of abstractness, not randomness or atypicality.
Arbitrariness in the latter sense drives you to more obscure cases as potential
counterexamples. But the sort of arbitrariness that is relevant to proving a
universal proposition pulls in the opposite direction of inclusiveness – an
instance that could potentially be any number.
It is this gap that creates the similarity with empirical induction, since in both cases
the conclusion of the argument seems beyond what the premises afford. An
induction axiom like (IAx) makes up for this gap in formal contexts, but why
is math induction a deductively correct proof tool if it can’t be backed up by
the infinite iteration it seems to require?
We’ve argued elsewhere (Rips, Bloomfield, & Asmuth, 2007) that math
induction is central to knowledge of mathematics: it seems unlikely that people
could have correct concepts of natural number and other key math objects
without a grip on induction. Basic rules of arithmetic, such as the associative
and commutative laws of addition and multiplication, are naturally proved
via math induction. If this is right, then it’s a crucial issue how people come
to terms with it. We’ve tried to argue here that the analogy we’ve just noted –
enumeration : conclusion by empirical induction :: modus ponens strategy :
conclusion by math induction – is misleading. Math induction doesn’t get its
backing from the modus ponens strategy. Rather the modus ponens strategy
plays at most a propaedeutic role, revealing an abstract property of the natural
number system that is responsible for the correctness of math induction. It is
still possible, of course, that math induction picks up quasi-inductive support
from its fruitfulness in proving further theorems, but in this respect it doesn’t
differ from any other math axiom.
Although we think that math induction doesn’t threaten the distinction
between deductive and inductive reasoning, there is a related issue about
generalization in math that might. Math proofs often proceed by selecting
an “arbitrary instance” from a domain, showing that some property is true
of this instance, and then generalizing to all the domain’s members. For this
universal generalization to work, the instance in question must be an abstrac-
tion or stand-in (an “arbitrary name” or variable) for all relevant individuals,
and there is no real concern that such a strategy is not properly deductive.
However, there’s psychological evidence that students don’t always recognize
the difference between such an abstraction and an arbitrarily selected exem-
plar. Sometimes, in fact, students use exemplars in their proofs (and evaluate
positively proofs that contain exemplars) that don’t even look arbitrary but
are simply convenient, perhaps because the exemplars lend themselves to con-
crete arguments that are easy to understand. In these cases, students are using
an inductive strategy, since the exemplar can at most increase their confidence
in the to-be-proved proposition.
It’s no news, of course, that people make mistakes in math. And it’s also
no news that ordinary induction has a role to play in math, especially in
the context of discovery. The question here is whether these inductive intru-
sions provide evidence that the deduction/induction split is psychologically
untenable. Does the use of arbitrary and not-so-arbitrary instances show that
people have a single type of reasoning mechanism that delivers conclusions
that are quantitatively stronger or weaker, but not qualitatively inductive ver-
sus deductive? We’ve considered one way in which this might be the case.
Perhaps people look for counterexamples, continuing their search through
increasingly arbitrary (i.e., haphazard or atypical) cases until they’ve found
such a counterexample or have run out of steam. The longer the search, the
more secure the conclusion. We’ve seen, however, that this procedure doesn’t
extend to deductively valid proofs; no matter how obscure the instance, it
will still have an infinite number of properties that prohibit you from gen-
eralizing from it. It is possible to contend that this procedure is nevertheless
all that people have at their disposal – that they can never ascend from their
search for examples and counterexamples to a deductively adequate method.
But although the evidence on proof evaluation paints a fairly bleak picture
of students’ ability to recognize genuine math proofs, the existence of such
proofs shows they are not completely out of reach.
References
Dedekind, R. (1963). The nature and meaning of numbers (W. W. Beman, Trans.).
In Essays on the theory of numbers (pp. 31–115). New York: Dover. (Original work
published 1888.)
Eliaser, N. M. (2000). What constitutes a mathematical proof? Dissertation Abstracts
International, 60 (12), 6390B. (UMI No. AAT 9953274)
Evans, J. St. B. T., Newstead, S. E., & Byrne, R. M. J. (1993). Human reasoning. Hillsdale,
NJ: Erlbaum.
Fallis, D. (1997). The epistemic status of probabilistic proofs. Journal of Philosophy, 94,
165–186.
Frege, G. (1974). The foundations of arithmetic (J. L. Austin, Trans.). Oxford, England:
Blackwell. (Original work published 1884.)
Goldfarb, W. (1988). Poincaré against the logicists. In W. Aspray & P. Kitcher (Eds.),
History and philosophy of modern mathematics (pp. 61–81). Minneapolis: University
of Minnesota Press.
Goel, V., Gold, B., Kapur, S., & Houle, S. (1997). The seats of reason? An imaging study
of deductive and inductive reasoning. NeuroReport, 8, 1305–1310.
Grattan-Guinness, I. (1997). The Norton history of the mathematical sciences. New York:
Norton.
Greenberg, M. J. (1993). Euclidean and non-Euclidean geometries (3rd ed.). New York:
Freeman.
Hallett, M. (1990). Review of History and philosophy of modern mathematics. Journal of
Symbolic Logic, 55, 1315–1319.
Kahneman, D., & Tversky, A. (1972). Subjective probability: A judgment of representa-
tiveness. Cognitive Psychology, 3, 430–454.
Kaye, R. (1991). Models of Peano arithmetic. Oxford, England: Oxford University Press.
Kline, M. (1972). Mathematical thought from ancient to modern times. Oxford, England:
Oxford University Press.
Knuth, D. E. (1974). Surreal numbers. Reading, MA: Addison-Wesley.
Koedinger, K. R., & Anderson, J. R. (1991). Interaction of deductive and inductive rea-
soning strategies in geometry novices. Proceedings of the Thirteenth Annual Conference
of the Cognitive Science Society, 780–784.
Lemmon, E. J. (1965). Beginning logic. London: Nelson.
Martin, W. G., & Harel, G. (1989). Proof frames of preservice elementary teachers.
Journal for Research in Mathematics Education, 20, 41–51.
Osherson, D., Perani, D., Cappa, S., Schnur, T., Grassi, F., & Fazio, F. (1998). Distinct
brain loci in deductive versus probabilistic reasoning. Neuropsychologia, 36, 369–376.
Poincaré, H. (1982). Science and hypothesis (G. B. Halsted, Trans.). In The foundations
of science. Washington, DC: University Press of America. (Original work published
1902.)
Polya, G. (1954). Induction and analogy in mathematics. Princeton, NJ: Princeton Uni-
versity Press.
Quine, W. V. (1950). Methods of logic (4th ed.). Cambridge, MA: Harvard University
Press.
Rips, L. J. (2001a). Two kinds of reasoning. Psychological Science, 12, 129–134.
Rips, L. J. (2001b). Reasoning imperialism. In R. Elio (Ed.), Common sense, reasoning,
and rationality (pp. 215–235). Oxford, England: Oxford University Press.
Rips, L. J., Bloomfield, A., & Asmuth, J. (2007). Number sense and number nonsense.
Manuscript submitted for publication.
Rubin, D. C., & Wenzel, A. E. (1996). One hundred years of forgetting. Psychological
Review, 103, 734–760.
Russell, B. (1919). Introduction to mathematical philosophy. New York: Dover.
Smith, L. (2002). Reasoning by mathematical induction in children’s arithmetic. Amster-
dam: Pergamon.
Stewart, I., & Tall, D. (1977). The foundations of mathematics. Oxford, England: Oxford
University Press.
11
p → q, ¬q
∴ ¬p                                                             (2)
are always endorsed at much lower rates. One interpretation of these phe-
nomena is that for these inferences people are assessing inductive strength,
on the assumption that the proportion of people endorsing an inference is a
reflection of the underlying subjective degree of belief that people have about
the conditional (Evans & Over, 2004).
Theoretically, one good reason to believe that people should not be partic-
ularly sensitive to deductive relations is that standard logic is monotonic; in
other words, the law of strengthening the antecedent holds; that is,
p→q
(3)
∴ (p ∧ r) → q
However, it seems that for most inferences that people are likely to encounter
in their everyday lives, this logical law fails.2 For example, if x is a bird, then x
flies does not entail that if (x is a bird and x is an ostrich), then x flies. As the
logical standard against which deductive correctness is normally assessed in
psychology is monotonic, that is, strengthening the antecedent is valid, perhaps
people should not be sensitive to deductive relations. Non-monotonicity
is the sine qua non of inductive reasoning. That is, each observation, bird A
can fly, bird B can fly, and so forth can be viewed as a premise in the induc-
tive argument to the conclusion that birds fly. However, this conclusion can
be defeated (logically) by the observation of one non-flying bird. Thus the
observation that most everyday arguments seem to be non-monotonic seems
to suggest that they are mainly assessed for argument strength, as for these
arguments deductive correctness may be unattainable.
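As a concrete numerical illustration of this failure of strengthening the antecedent, the sketch below reads the conditional as a conditional probability over a small invented population of birds and ostriches; the frequencies are made up purely for illustration.

# Invented frequencies: 95 flying birds, 5 non-flying ostriches, 100 non-birds.
population = [
    ({"bird": True,  "ostrich": False, "flies": True},  95),
    ({"bird": True,  "ostrich": True,  "flies": False},  5),
    ({"bird": False, "ostrich": False, "flies": False}, 100),
]

def cond_prob(consequent, antecedent):
    # P(consequent | antecedent) computed from the weighted population.
    num = sum(n for attrs, n in population if antecedent(attrs) and consequent(attrs))
    den = sum(n for attrs, n in population if antecedent(attrs))
    return num / den

# "If x is a bird, then x flies" is highly probable ...
print(cond_prob(lambda x: x["flies"], lambda x: x["bird"]))                   # 0.95
# ... but strengthening the antecedent with "x is an ostrich" destroys it.
print(cond_prob(lambda x: x["flies"], lambda x: x["bird"] and x["ostrich"]))  # 0.0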
Rips (2001a, b; 2002; see also Rips [this volume] for further discussion)
has argued that trying to make inductive strength do service for deductive
correctness (Oaksford & Chater, 1998, 2001) or trying to make deductive
correctness do service for inductive strength (Johnson-Laird, 1994; Johnson-
Laird, Legrenzi, Girotto, Legrenzi, & Caverni, 1999) is misguided, and he refers
to these attempts as reasoning imperialism. In this chapter, we argue that our
position is not an imperialist one. Rather, it is that while people may well be
capable of assessing deductive correctness when explicitly asked to, this is rarely,
if ever, their focus of interest in evaluating an argument. Thus we will conclude
that inductive strength is probably more important in determining people’s
behaviour than deductive correctness even on putative deductive reasoning
tasks. However, despite our emphasis on inductive strength, we still think that
deductive relations, or structure, may be crucial to human inference, as we
point out in Section 4.
In Section 1, we begin by examining Rips’ (2001) own evidence and argu-
ments for a clear distinction between these two ways of evaluating inferences.
We argue (i) that the data are perhaps not as convincing as they may first
seem, (ii) that the claim of sensitivity to deductive correctness cannot be
established by examining isolated inferences but only for a whole
2 This can be readily established by a simple thought experiment. Think of any conditional
statement about the everyday world you like and see how easy it is to generate exceptions to the
rule (see also Chater & Oaksford, 1996).
logic, (iii) that the generality of a concept of a logical system makes it difficult
to refute the claim that people are sensitive to deductive relations in some
logic, but (iv) that given (ii), it is clear that people probably should not be
sensitive to standard deductive validity because it leads to many paradoxes.
In Section 2, we move on to look at conditional inference, as in the examples
of modus ponens and modus tollens above. Here we argue that an account based
on probability logic (Adams, 1975, 1998) which uses an account of inductive
strength resolves many of the paradoxes observed for standard deductive
validity and moreover provides a better account of the data. We also show
that inductive processes, that is, updating beliefs in the light of evidence by
Bayesian revision, may be directly involved in drawing conditional inferences
because of possible violations of what is called the rigidity condition for
Jeffrey conditionalization. We also evaluate the inductive strength of some
conditional argument forms. In particular, we show that people may judge
some conditional reasoning fallacies as being as strong as a deductively correct
inference.
Finally, we show in Section 3 how some very similar proposals address infor-
mal argument fallacies, for example, the argument from ignorance, circularity,
and the slippery slope argument. That is, invoking the concept of argument
or inductive strength can resolve the paradox of why some instances of in-
formal argument fallacies seem perfectly acceptable and capable of rationally
persuading someone of a conclusion.
The theme of this chapter could be characterized as an attempt to demonstrate
that an account of argument strength helps resolve many paradoxes and
fallacies in reasoning and argumentation. A logical paradox (see Section 1.2
on the paradoxes of material implication) is usually taken to mean an example
of a logically valid inference, for which there seem to be instances that are
considered obviously unsound.3 Following logicians like Adams (1975, 1998),
we suggest that this may be because these inferences are inductively weak. To
commit a logical fallacy is to endorse an argument that is not logically valid.
What is paradoxical about the logical fallacies is that there seem to be many for
which there are instances that seem to be perfectly good arguments. Indeed
this point extends to a whole range of what are typically called informal
3 There is a potentially important distinction between valid but unsound inferences with true
premises and inferences with false premises that are unsound. It has been argued (Ferguson,
2003) that the defeasibility revealed by violations of strengthening the antecedent is not prob-
lematic because the violations are instances where the premises are strictly false, for example, it
is known that not all birds fly. However, the paradoxes of material implication, as we will see,
seem to be instances where not only is the inference intuitively unsound, but the premises are
also considered true.
reasoning fallacies in logic text books (e.g., Copi & Cohen, 1990). Although
people have been warned against committing these fallacies since Aristotle’s
Sophistical refutations, examples continue to crop up that seem like perfectly
convincing arguments. We suggest that a probabilistic notion of argument
strength allows instances of fallacies that are convincing to be distinguished
from instances that are not convincing.
The upshot of all these arguments is not to deny that people possess some
understanding of deductive validity, but rather to establish that in determining
human inferential behaviour, argument strength is the more central concept.
Thus, in the conclusion, we point out, by considering some recent approaches
in AI, that without some structure it would not be possible to appropriately
transmit inductive strengths from the premises to the conclusions of com-
plex arguments. Consequently, deductive correctness and inductive strength
should be seen as working together rather than as independent systems.
The premise above the line does not logically entail the conclusion below the
line, but the probability of the conclusion given the premise is high because it
Not(Car X10 runs into a brick wall and Car X10 stops) (5)
Car X10 runs into a brick wall
∴ Car X10 does not stop
4 This question is also highly ambiguous as not even experts agree on the “form” of English
sentences. So exactly what participants understood from this instruction is unclear.
5 For conjunction elimination, p and q therefore q, the following argument form was used as the
logically incorrect and causally consistent case, p therefore p and q. That is, rather than omit a
major premise, a conjunct was removed, again leading to an unstructured premise.
Consequently, Rips (2001) replaced Car X10 stops with Car X10 speeds on. The
conclusion is then that Car X10 does not speed on, which is causally consistent
with the minor premise. Constructing the deductively incorrect but causally
consistent case, while retaining some logical structure, could be achieved by
using the premises in (5) but reversing the polarity of the conclusion, that
is, Car X10 stops. The problem about (5), even with this substitution, is that
the major premise seems to express a causally inconsistent proposition. One
of the other facts about Car X10 that seems to have to be true when it hits a
wall is that it stops, but the major premise denies this (assuming that there is
only one Car X10 and that the two conjuncts are indexed to the same space-
time location). Similar arguments apply for modus ponens and disjunctive
syllogism. These considerations suggest that including some logical structure
for the deductively incorrect but causally consistent cases would appear to
have to introduce some level of causal inconsistency.
Despite the absence of logical structure for the deductively incorrect and
causally consistent cases in Rips (2001), 35% of these arguments were judged
deductively correct. We believe that a proper test of the hypothesis would have
to involve using materials like (5) but with the reversed polarity conclusion
for the deductively incorrect but causally consistent cases. We suspect that the
additional structure may lead to more judgements of deductive correctness
while the presence of a causally inconsistent major premise may lead to fewer
judgements of causal consistency. The combined effect would be to reduce
the gap between these two types of judgements found by Rips (2001, see
also Heit & Rotello, 2005⁶). In sum, the way the deductively incorrect but
causally consistent cases were constructed seems to beg the question against
the unitary view. Thus, we don’t interpret the existing evidence using the Rips
(2001) paradigm as convincing evidence against a unitary view.
In this section we have concentrated on arguments that are deductively in-
correct but inductively strong and argued that these may be more closely
aligned than Rips (2001) might lead one to expect. In the next section,
we concentrate on arguments that are deductively strong but inductively
weak.
6 Heit and Rotello (2005) used similar materials to Rips (2001) but used abstract material (Ex-
periment 1) and analysed their results using signal detection theory. The results were similar
to Rips (2001). Heit and Rotello’s (2005) Experiment 2 used quantified statements where class
inclusion (all birds have property A, therefore robins have property A) was respected or violated
(premise and conclusion inverted). While these results are interesting, our feeling is that they
tell us most about the semantics/pragmatics of the words “necessary” and “plausible,” where it
has been known for some time that they don’t neatly carve up some underlying probability scale
(see Dhami & Wallsten, 2005; Rapoport, Wallsten, & Cox, 1987; Wallsten, Budescu, Rapoport,
Zwick, & Forsyth, 1986).
people would regard these as deductively correct, and that is why they are
regarded as paradoxes. However, these turned out to be inductively weak,
which explains their paradoxical status.8
In the next section, we explore psychological accounts of the conditional
more fully. In particular, we look at recent accounts of the psychological
data based on Adams’s (1975, 1998) probability conditional. Following some
recent proposals (Oaksford, 2005), we suggest that not only are conditionals
assessed for inductive strength in psychological experiments, but inductive
learning processes may also be invoked in assessing conditional inferences.
We also evaluate the inductive strength of the conditional arguments usually
investigated in the psychology of reasoning.
Along with many other researchers (e.g., Anderson, 1995; Chan & Chua,
1994; George, 1997; Liu, 2003; Liu, Lo, & Wu, 1996; Politzer, 2005; Stevenson
& Over, 1995), we have recently been arguing for an account of conditional
inference based on inductive strength (Oaksford & Chater 2007; Oaksford &
Chater, 1998, 2001, 2003a–d; Oaksford, Chater, & Larkin, 2000). The main
assumption underlying this approach, as we discussed in the introduction and
in the last section, is that the probability of a conditional is the conditional
probability, that is, see (7) above. This assumption is at the heart of Adams’s
(1975, 1998) account of the probability conditional. However, in that account
the Ratio formula (8) is not definitional of conditional probability. People’s
understanding of P (q | p) is given by the subjective interpretation provided
by the Ramsey test. As Bennett (2003) says:
The best definition we have [of conditional probability] is the one provided
by the Ramsey test: your conditional probability for q given p is the prob-
ability for q that results from adding P ( p) = 1 to your belief system and
conservatively adjusting to make room for it (p. 53).
The Ratio formula indicates that in calculating the probability of the every-
day conditional, all that matters are the probabilities of the true antecedent
cases, that is, P(p ∧ q) and P(p ∧ ¬q) [where P(p) = P(p ∧ q) + P(p ∧ ¬q)].
This contrasts immediately with standard logic where the probability of a
conditional is given by the probability of the cases that make it true, that
is, P(p → q) = P(¬p) + P(p, q). Evans et al. (2003) and Oberauer and
Wilhelm (2003) have confirmed that people interpret the probability of the
everyday conditional as the conditional probability.
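The contrast between the two readings can be made explicit with a small sketch; the joint distribution below is arbitrary and purely illustrative.

# An arbitrary joint distribution over the four truth-table cases for p and q.
joint = {("p", "q"): 0.30, ("p", "not-q"): 0.10,
         ("not-p", "q"): 0.20, ("not-p", "not-q"): 0.40}

p_and_q = joint[("p", "q")]
p       = joint[("p", "q")] + joint[("p", "not-q")]
not_p   = 1 - p

# Ratio formula: only the true-antecedent cases matter.
prob_conditional_as_cond_prob = p_and_q / p          # P(q | p) = 0.75
# Standard logic: the probability of the cases that make p -> q true.
prob_material_implication     = not_p + p_and_q      # P(not-p) + P(p, q) = 0.90

print(prob_conditional_as_cond_prob, prob_material_implication)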
8 Although, as we discuss later (in the paragraph before Section 2.1), this is not the only possible
response. In some non-classical logics, the paradoxes are also not valid inferences.
9 These considerations place constraints on the way conditionals can be embedded in truth-
functional compound sentences (Lewis, 1976). However, there is linguistic evidence that the
constraints on embeddings recommended by the probability conditional may be reflected in
natural languages (Bennett, 2003). Bennett (2003) argues that this linguistic fact seems to disarm
some of the strongest criticisms of the Equation due to Lewis (1976).
This thesis may also seem to violate intuitions that conditionals such as “if a steam roller
runs over this teacup, it will break” and “if Lincoln visited Gettysburg, then he delivered a
speech there” are determinately true. The issues highlighted by examples like these relate to how
conditionals and the predicate “true” are used in natural language. It is possible that the teacup
does not break (as it is made of titanium, i.e., it just bends a bit). Moreover, only someone who
knew Lincoln gave the Gettysburg address would be in a position to affirm this conditional,
but, given this knowledge, why would they ever have grounds to assert this conditional? “True”
might also be used in natural language to mean a probability infinitesimally different, but still
different, from 1 (Pearl, 1988).
All p-valid arguments are classically valid, but not all classically valid ar-
guments are p-valid. For example, as we have seen, the paradoxes of material
implication (6) are not p-valid, that is, they can be inductively weak. Strength-
ening of the antecedent (3) also turns out not to be p-valid. Although it makes
sense to assign a high probability to if Tweety is a bird, then Tweety can fly,
because most birds fly, anyone would assign a very low probability, that is, 0,
to if Tweety is a bird and Tweety is one second old, then Tweety can fly (how
old Tweety is can be adjusted to produce probabilities for the conclusion
that are greater than 0). Thus the uncertainty of the conclusion, that is, 1, is
greater than the uncertainty of the premise – it is an inductively weak argu-
ment. So the probability conditional is defeasible or non-monotonic at its very
foundation.
P-validity holds for an argument form or schema. For example, if it is
p-valid, there can be no instances where (9) is violated. Inductive strength
applies to particular arguments, that is, content matters. So for example, if
“is one second old” were replaced with “is green,” then this instance of strength-
ening the antecedent might be viewed as inductively strong.10 Strengthening
of the antecedent nonetheless is not p-valid because (9) does not hold for
all instances of this inference, that is, (9) does not hold in all models of the
premises. An instance of a p-valid inference may also be inductively weak. For
example, in a modus ponens inference, someone may be maximally uncertain
about whether birds fly and whether Tweety is a bird, that is, the probability
of both premises is .5. As we will see later, the probability of the conclusion is
therefore .25. That is, although (9) is not violated, this is an inductively weak
argument, that is, the probability of the conclusion given the premises is low.
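The modus ponens example just given can be spelled out as follows; the reading of (9) as Adams's requirement that the conclusion's uncertainty not exceed the summed uncertainty of the premises is an assumption here, since (9) itself is not reproduced above.

p_conditional = 0.5   # P0(if Tweety is a bird, Tweety flies): maximally uncertain
p_categorical = 0.5   # P1(Tweety is a bird): maximally uncertain

# By equation (15) below, the probability of the conclusion is the product.
p_conclusion = p_conditional * p_categorical
print(p_conclusion)   # 0.25: the instance is inductively weak

# The p-validity constraint (assumed reading of (9)) is nevertheless respected:
# the conclusion's uncertainty does not exceed the summed premise uncertainties.
assert 1 - p_conclusion <= (1 - p_conditional) + (1 - p_categorical)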
However, when it comes to conditionals, p-validity is perhaps not that
interesting a concept, that is, it is useful only in the static case of fixed prob-
ability distributions. Conditionals in probability logic can also be looked at
dynamically, that is, they allow people to update their beliefs (prior probabil-
ity distributions) given new information. So if a high probability is assigned
to if x is a bird, x flies, then on acquiring the new information that Tweety is
a bird, one's degree of belief in Tweety flies should be revised to one's degree
of belief in Tweety flies given Tweety is a bird, that is, one's degree of belief in the
conditional. So using P0 to indicate prior degree of belief and P1 to indicate
posterior degree of belief, then:
10 Although, as “is green” seems totally irrelevant to whether birds fly, this argument feels
circular, that is, the conclusion is just a restatement of the premises. We look at argumentative
fallacies, like circularity, in relation to an account of argument strength in Section 4.
AC: P1(p) = P0(p | q) = ab/c                                     (13)
MT: P1(¬p) = P0(¬p | ¬q) = (1 − c − (1 − a)b)/(1 − c)            (14)
Equations (11) to (14) show the posterior probabilities of the conclusion of
each inference, assuming the posterior probability of the categorical premise
is 1. By using Jeffrey conditionalization, these cases can be readily generalized
to when the probability of the categorical premise is less than 1. For example,
for MP, assuming P1 ( p) = d:
MP: P1(q) = P0(q | p)P1(p) = ad                                  (15)
(15) actually represents a lower bound on the posterior probability, with the
upper bound at ad + 1 – d.
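A minimal sketch of equation (15) and the bounds just stated; the illustrative values of a and d are not from the chapter.

def mp_posterior_bounds(a, d):
    # a = P0(q | p), d = P1(p). Equation (15) gives the lower bound on P1(q);
    # the upper bound stated in the text is ad + 1 - d.
    return a * d, a * d + 1 - d

print(mp_posterior_bounds(a=0.9, d=0.8))   # (0.72, 0.92)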
In studies investigating all four inferences, a significant asymmetry has been
observed between modus ponens (1) and modus tollens (2), with (1) being
endorsed considerably more than (2). Moreover, affirming the consequent
is endorsed more than denying the antecedent. In most other psychological
accounts, this latter asymmetry is explained as an instance of the asymmetry
between (1) and (2) on the assumption that some people believe that if p
then q pragmatically implicates its converse, if q then p. Oaksford, Chater, and
Larkin (2000; Oaksford & Chater, 2003a, b) showed that as long as P0(q | p) is
less than 1, then (11) to (14) can generate these asymmetries. The precise
magnitudes of the asymmetries depend on the particular values of P0(q | p),
P0(p), and P0(q).
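These asymmetries can be reproduced with a short sketch. Equations (11) and (12) for MP and DA are not quoted above, so the expressions used for them here are reconstructions in the spirit of the cited Oaksford, Chater, and Larkin (2000) model rather than the chapter's own formulas, and the parameter values are illustrative only.

def conditional_inferences(a, b, c):
    # a = P0(q | p), b = P0(p), c = P0(q).
    mp = a                                    # assumed form of equation (11)
    da = (1 - c - (1 - a) * b) / (1 - b)      # assumed form of equation (12)
    ac = a * b / c                            # equation (13)
    mt = (1 - c - (1 - a) * b) / (1 - c)      # equation (14)
    return {"MP": mp, "DA": da, "AC": ac, "MT": mt}

# With P0(q | p) < 1, suitable priors yield the MP > MT and AC > DA asymmetries.
print(conditional_inferences(a=0.9, b=0.4, c=0.7))
# {'MP': 0.9, 'DA': 0.43..., 'AC': 0.51..., 'MT': 0.87...}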
However, no values of these probabilities are capable of generating the em-
pirically observed magnitudes of the MP–MT and AC–DA asymmetries –
the best fit values underestimate the former and overestimate the latter.
Oaksford & Chater (2007) argued that this is related to perceived failures
of what is called the rigidity assumption for conditionalization, which is that
P1(q | p) = P0(q | p)                                            (16)
This assumption is implicit in (15). This means that learning that the cate-
gorical premise has a particular posterior value does not alter the conditional
probability. However, consider the following example for MT, where the con-
ditional premise is if you turn the key the car starts and the categorical premise
is the car does not start. What is odd about the assertion of the categorical
premise is that it seems informative only in a context where the car was ex-
pected to start, and the only way one would normally expect a car to start
was if the key had been turned. Consequently, the assertion of the categorical
premise raises the possibility of a counterexample to the conditional.
This in turn reduces the posterior degree of belief in the conclusion. Oaks-
ford and Chater (2007) generalise this account to DA and AC and show how
it provides a much better account of the MP–MT and AC–DA asymmetries.
For example, using the same figure as in the above example, AC should be en-
dorsed at 59% and DA at 51%. In sum, evaluating conditional inferences for
inductive strength rather than deductive correctness gets round paradoxical
problems for the standard logic of the conditional. Moreover, it provides a bet-
ter account of the data by invoking inductive processes of Bayesian updating
to cope with rigidity violations.
11 These studies all involved abstract material. Consequently, it could be argued that people have
no prior beliefs to bring to bear. However, the conversational pragmatics of the situation, where
the conditional is asserted, immediately suggest, via the Ratio formula, that P0 (q | p) is high and
so P0 ( p) is non-zero. In modelling these data Oaksford & Chater (2007) only assumed that
P0 ( p) is maximally uncertain, that is, .5. Moreover, in general we believe that prior knowledge is
always brought to bear analogically even on abstract materials, that is, perhaps only a small subset
of exemplars of conditional relationships are accessed to provide rough and ready estimates of
some of the relevant parameters.
experiments had the poorest fits to the data and were technically outliers. With
these experiments removed, the fit, as measured by the coefficient of determination
R², did not differ significantly between the studies where AC or DA were
stronger than MT (mean R² = .92, SD = .07) and the rest (mean R² = .93,
SD = .08), t(56) = .57, p = .57.
It could also be argued that these results are not surprising because in other
theories, like mental models (Johnson-Laird & Byrne, 2002) or versions of
mental logic (Rips, 1994), AC can be drawn more often than MT. In these
theories, this happens if the task rule is interpreted as a bi-conditional, in
which case all four inferences are valid. However, if p then q does not logically
entail if and only if p then q. Moreover, the simple alphanumeric stimuli
used in the experiments in Schroyens and Schaeken’s (2003) meta-analysis
do not provide an appropriate context for the bi-conditional interpretation.
Consequently, if people do interpret the rules in this way, they are still in
error. There are some studies for which DA is drawn at least as often, if not
more often, than MT, although, within a study, we doubt whether this effect
was ever significant. These data can be interpreted as showing that some logical
reasoning fallacies are regarded as at least as strong as deductively correct
inferences (although the data are also consistent with the interpretation that
people erroneously assume that the conditional implicates its converse). In the
probabilistic account we introduced in the last section, DA can be endorsed
more than MT when P0 ( p) is greater than P0 (q ), which is probabilistically
consistent as long as P0 (q | p) is not too high.
In this section, we have discussed alternative accounts of argument strength
to which we think people may be sensitive. However, the experiment needed to
judge between these measures has not been conducted. This
is because in standard conditional reasoning experiments, people are asked
to judge only their absolute rate of endorsement of the conclusion, which
is a function of both the prior, P (conclusion), and the amount of change,
P (conclusion|premises) – P (conclusion), brought about by an argument. How-
ever, the analysis in this section did not show much discrimination between a
measure of change, the likelihood ratio, and P (conclusion|premises). This will
require experiments where change is explicitly addressed and where people
are not just asked “how much do you endorse a conclusion?” but also “how good
an argument is it for you?”
We also observed that there are experiments where the argument strength
associated with a reasoning fallacy was as high as that associated with a de-
ductively correct inference. This is at least suggestive of a rather extreme
dissociation between deductive correctness and argument strength that goes
beyond that observed by Rips (2001). In Rips’s data, no deductively incorrect
Ghosts exist, because nobody has proven that they don’t. (19)
This argument does indeed seem weak, and one would want to hesitate in
positing the existence of all manner of things whose non-existence simply had
not been proven, whether these be UFOs or flying pigs with purple stripes.
However, is it really the general structure of this argument that makes it weak,
and if so what aspect of it is responsible? Other arguments from negative
evidence are routine in scientific and everyday discourse and seem perfectly
acceptable:
This drug is safe, because no one has found any side effects. (20)
Drug A is not toxic because no toxic effects were observed (negative argu-
ment, i.e., the argument from ignorance). (22)
Though (22) too can be acceptable where a legitimate test has been performed,
that is,
P(T | e) = nh / (nh + (1 − l)(1 − h))                            (23)
P(¬T | ¬e) = l(1 − h) / (l(1 − h) + (1 − n)h)                    (24)
Sensitivity corresponds to the “hit rate” of the test and 1 minus the selectivity
corresponds to the “false positive rate.” There is a trade-off between sensitivity
and selectivity which is captured in the receiver-operating characteristic curve
(Green & Swets, 1966) and which plots sensitivity against the false positive
rate (1 – selectivity). Where the criterion is set along this curve will determine
the sensitivity and selectivity of the test.
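The trade-off can be illustrated with a short sketch of equations (23) and (24). The readings of n, l, and h as the test's sensitivity, its selectivity, and the prior probability of the hypothesis T follow the cited Oaksford and Hahn (2004) account and, like the example values, are assumptions rather than quotations from the chapter.

def test_validities(n, l, h):
    # n = sensitivity P(e | T); l = selectivity P(not-e | not-T); h = prior P(T).
    positive = n * h / (n * h + (1 - l) * (1 - h))        # equation (23): P(T | e)
    negative = l * (1 - h) / (l * (1 - h) + (1 - n) * h)  # equation (24): P(not-T | not-e)
    return positive, negative

# With these illustrative values the positive argument ("toxic effects were
# observed, so the drug is toxic") is stronger than the argument from ignorance.
print(test_validities(n=0.8, l=0.95, h=0.5))   # (0.94..., 0.82...)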
Positive test validity is greater than negative test validity as long as the
following inequality holds:
12 This means that people’s criterion is set such that positive evidence counts more for a hypothesis
than negative evidence counts against it.
Rips (2001, 2002) suggested that recent proposals in the psychology of de-
ductive reasoning seem to imply reasoning imperialism. That is, proponents
of the view that people evaluate even deductive arguments for inductive or
argument strength are interpreted as seeing no role for deductive correct-
ness (e.g., Oaksford & Chater, 1998). Moreover, some researchers seem to be
committed to the opposite point of view that deductive correctness can pro-
vide an account of inductive strength (Johnson-Laird, 1994; Johnson-Laird,
Legrenzi, Girotto, Legrenzi, & Caverni, 1999). While we cannot reply for the
latter imperialists, we have attempted to reply for the former. However, as
we have pointed out, we do not view ourselves as arguing for an imperialist
position, and in concluding we will point out why. But first we summarise
what we hope has been learnt from this discussion.
We first addressed Rips’s (2001) paper. We showed that the data are perhaps
not as convincing as they may first seem, because the examples of deductively
incorrect and inductively strong arguments had no logical structure and yet
up to 35% of participants identified them as deductively correct. Moreover, we
argued that the claim of sensitivity to deductive correctness can be
established only for a whole logic and not by examining isolated inferences. We
therefore looked at some other simple inferences that are valid in propositional
logic, the implicit standard in Rips (2001), and showed that some are para-
doxical, that is, they lead to intuitively unacceptable conclusions. That is, one
would not want to reason according to this standard. We diagnosed the prob-
lem as a mismatch between argument strength and deductive correctness –
if argument strength can be low for a deductively correct argument, it may be
viewed as paradoxical. This is prima facie strong evidence for the primacy of
argument strength.
We then looked at conditional inference in particular. We showed that
probability logic (Adams, 1998) uses an account of inductive strength which
resolves many of the paradoxes observed for standard deductive validity.
Moreover, we showed how a dynamic view of inference in probability logic
is consistent with Oaksford et al.’s (2000) probabilistic account. This account
also provides a better account of the data by invoking inductive processes
of updating beliefs in the light of possible counterexamples. The possibil-
ity of counterexamples arises because of possible violations of the rigidity
condition for conditionalization. We also evaluated the inductive strength of
some conditional argument forms and showed that people judge some con-
ditional reasoning fallacies to be as strong as a deductively correct inference.
This should not be possible according to the account Rips defends based on
standard logic, where all deductively correct inferences must have maximal
argument strength, that is, P(conclusion | premises) = 1.¹³
We finally showed that the concept of argument strength can be applied
much more generally to the informal argument fallacies, such as the argument
from ignorance, to resolve the paradox of why some instances of these fallacies
seem perfectly acceptable and capable of rationally persuading someone of
a conclusion. This again provides a strong argument for the centrality of
argument strength in human reasoning and argumentation.
However, as we said in the introduction to the chapter, the upshot of
all these arguments is not to deny that people possess some understand-
ing of deductive validity, but rather to establish that in determining hu-
man inferential behaviour, argument strength is the more central con-
cept. In AI knowledge representation, technical problems in developing
tractable non-monotonic logics that adequately capture human inferen-
tial intuitions have provided the impetus for the development of alterna-
tive approaches that explicitly include some measure of the strength of an
argument (Gabbay, 1996; Fox, 2003; Fox & Parsons, 1998; Pollock, 2001;
Prakken & Vreeswijk, 2002). It is this inclusion that provides these sys-
tems with their nice default properties. These systems often address un-
certainty and argument strength qualitatively rather than using the probabil-
ity calculus. In some accounts (e.g., Pollock, 2001) argument strengths are
assumed to range from 0 to ∞, whereas in others they are treated as qual-
itative categories with varying numbers of bins, for example, ++ , +, –, – –
(Fox & Parsons, 1998; see also some of the work reviewed in Prakken &
Vreeswijk, 2002). In the latter case, argument strengths can combine like
multivalued truth tables. Some of these rules for combining qualitative argu-
ment strengths are consistent with the probability calculus (Fox & Parsons,
1998). One important aspect of these approaches is that most use logical
entailment to determine whether a proposition is relevant to another propo-
sition and so can transmit an element of argument strength from one to the
other. These accounts illustrate an important fact: building practical sys-
tems that track argument strength across complex arguments requires some
account of structure.
These systems also raise other interesting questions that reasoning re-
searchers probably need to address. For example, some of these systems are
not consistent with the probability calculus. The reason is the apparently
13 The reason this is not the case in probability logic is that the moment the Equation is accepted,
one is outside the realm of classical or standard logic, in which the probability of a conditional
is P(¬p) + P(p, q).
References
Adams, E. W. (1975). The logic of conditionals. Dordrecht: Reidel.
Adams, E. W. (1998). A primer of probability logic. Stanford, CA: CSLI Publications.
Anderson, A. R., & Belnap, N. D. (1975). Entailment: The logic of relevance and necessity,
Vol. 1, Princeton: Princeton University Press.
Anderson, J. R. (1995). Cognitive psychology and its implications. New York: W. H.
Freeman.
Bennett, J. (2003). A philosophical guide to conditionals. Oxford: Oxford University Press.
Bishop, Y. M. M., Feinberg, S. E., & Holland, P. W. (1975). Discrete multivariate analysis.
Cambridge, MA: MIT Press.
Chan, D., & Chua, F. (1994). Suppression of valid inferences: Syntactic views, mental
models, and relative salience. Cognition, 53, 217–238.
Chater, N., & Oaksford, M. (1996). The falsity of folk theories: Implications for psy-
chology and philosophy. In W. O’Donohue & R. F. Kitchener (Eds.), The philosophy
of psychology (pp. 244–256). London: Sage Publications.
Chater, N., & Oaksford, M. (1999). The probability heuristics model of syllogistic rea-
soning. Cognitive Psychology, 38, 191–258.
Clark, K. L. (1978). Negation as failure. In H. Gallaire & J. Minker (Eds.), Logic and
databases (pp. 293–322). New York: Plenum Press.
Copi, I. M., & Cohen, C. (1990). Introduction to logic (8th Ed.). New York: Macmillan
Press.
Dhami, M. K., & Wallsten T. S. (2005). Interpersonal comparison of subjective proba-
bilities: Towards translating linguistic probabilities. Memory and Cognition, 33, 1057–
1068.
Edgington, D. (1991). The matter of the missing matter of fact. Proceedings of the
Aristotelian Society, Suppl. Vol. 65, 185–209.
Evans, J. St. B. T., & Over, D. E. (2004). If. Oxford: Oxford University Press.
Evans, J. St. B. T., Handley, S. J., & Over, D. E. (2003). Conditionals and conditional
probability. Journal of Experimental Psychology: Learning, Memory, & Cognition, 29,
321–335.
Enderton, H. (1972). A mathematical introduction to logic. New York: Academic
Press.
Fox, J. (2003). Probability, logic and the cognitive foundations of rational belief. Journal
of Applied Logic, 1, 197–224.
Fox, J., & Parsons, S. (1998). Arguing about beliefs and actions. In A. Hunter and
S. Parsons (Eds.), Applications of uncertainty formalisms (Lecture notes in artificial
intelligence 1455) (pp. 266–302). Berlin: Springer Verlag.
Gabbay, D. (1996). Labelled deduction systems. Oxford: Oxford University Press.
George, C. (1997). Reasoning from uncertain premises. Thinking and Reasoning, 3,
161–190.
Green, D. M., & Swets, J. A., (1966). Signal detection theory and psychophysics. New York:
Wiley.
Hahn, U., & Oaksford, M. (2006). A Bayesian approach to informal argument fallacies.
Synthese, 152, 207–236.
Hahn, U., Oaksford, M., & Bayindir, H. (2005). How convinced should we be by neg-
ative evidence? In B. Bara, L. Barsalou, and M. Bucciarelli (Eds.), Proceedings of the
27th Annual Conference of the Cognitive Science Society (pp. 887–892), Mahwah, NJ:
Lawrence Erlbaum.
Hahn, U., Oaksford, M., & Corner, A. (2005). Circular arguments, begging the question
and the formalization of argument strength. In A. Russell, T. Honkela, K. Lagus, & M.
Pöllä, (Eds.), Proceedings of AMKLC’05, International Symposium on Adaptive Models
of Knowledge, Language and Cognition (pp. 34–40), Espoo, Finland, June 2005.
Hamblin, C. L. (1970). Fallacies. London: Methuen.
Hattori, M., & Oaksford, M. (in press). Adaptive non-interventional heuristics for co-
variation detection in causal induction: Model comparison and rational analysis.
Cognitive Science.
Heit, E., & Rotello, C. M. (2005). Are there two kinds of reasoning? In B. Bara, L.
Barsalou, & M. Bucciarelli (Eds.), Proceedings of the 27th Annual Conference of the
Cognitive Science Society (pp. 923–928). Mahwah, NJ: Lawrence Erlbaum.
Hughes, G. E., & Cresswell, M. J. (1968). An introduction to modal logic. London:
Methuen.
Johnson-Laird, P. N., & Byrne, R. M. J. (1991). Deduction. Hove, Sussex: Lawrence
Erlbaum.
Johnson-Laird, P. N., & Byrne, R. M. J. (2002). Conditionals: A theory of meaning,
pragmatics, and inference. Psychological Review, 109, 646–678.
Johnson-Laird, P. N., Legrenzi, P., Girotto, V., Legrenzi, M. S., & Caverni, J. P. (1999).
Naive probability: A mental model theory of extensional reasoning. Psychological
Review, 106(1), 62–88.
Lewis, D. (1973). Counterfactuals. Oxford: Blackwell.
Lewis, D. (1976). Probabilities of conditionals and conditional probabilities. Philosoph-
ical Review, 85, 297–315.
Liu, I. M. (2003). Conditional reasoning and conditionalisation. Journal of Experimental
Psychology: Learning, Memory, & Cognition, 29, 694–709.
Liu, I. M., Lo, K., & Wu, J. (1996). A probabilistic interpretation of “If-then.” Quarterly
Journal of Experimental Psychology, 49A, 828–844.
Massey, G. J. (1981). The fallacy behind fallacies. Midwest Studies in Philosophy, 6, 489–
500.
Nute, D. (1984). Conditional logic. In D. Gabbay & F. Guenthner (Eds.), Handbook of
philosophical logic, Vol. 2 (pp. 387–439). Dordrecht: Reidel.
Oaksford, M., & Chater, N. (1991). Against logicist cognitive science. Mind & Language,
6, 1–38.
Oaksford, M., & Chater, N. (1994). A rational analysis of the selection task as optimal
data selection. Psychological Review, 101, 608–631.
Oaksford, M., & Chater, N. (1996). Rational explanation of the selection task. Psycho-
logical Review, 103, 381–391.
Oaksford, M., & Chater, N. (1998). Rationality in an uncertain world: Essays on the
cognitive science of human reasoning. Hove, Sussex: Psychology Press.
Oaksford, M., & Chater, N. (2001). The probabilistic approach to human reasoning.
Trends in Cognitive Sciences, 5, 349–357.
Oaksford, M. & Chater, N. (2002). Commonsense reasoning, logic and human rational-
ity. In R. Elio (Ed.), Common sense, reasoning and rationality (pp. 174–214). Oxford:
Oxford University Press.
Oaksford, M., & Chater, N. (2003a). Computational levels and conditional reasoning:
Reply to Schroyens and Schaeken (2003). Journal of Experimental Psychology: Learning,
Memory, & Cognition, 29, 150–156.
Oaksford, M., & Chater, N. (2003b). Conditional probability and the cognitive science
of conditional reasoning. Mind & Language, 18, 359–379.
Oaksford, M., & Chater, N. (2003c). Modeling probabilistic effects in conditional infer-
ence: Validating search or conditional probability? Revista Psychologica, 32, 217–242.
Oaksford, M., & Chater, N. (2003d). Probabilities and pragmatics in conditional infer-
ence: Suppression and order effects. In D. Hardman & L. Macchi (Eds.), Thinking:
Psychological perspectives on reasoning, judgment and decision making (pp. 95–122).
Chichester, UK: John Wiley & Sons.
Oaksford, M., & Chater, N. (2003e). Optimal data selection: Revision, review and re-
evaluation. Psychonomic Bulletin & Review, 10, 289–318.
Oaksford, M., & Chater, N. (2007). Bayesian rationality: The probabilistic approach to
human reasoning. Oxford: Oxford University Press.
Oaksford, M., Chater, N., & Larkin, J. (2000). Probabilities and polarity biases in condi-
tional inference. Journal of Experimental Psychology: Learning, Memory & Cognition,
26, 883–899.
Oaksford, M., & Hahn, U. (2004). A Bayesian approach to the argument from ignorance.
Canadian Journal of Experimental Psychology, 58, 75–85.
Oberauer, K., & Wilhelm, O. (2003). The meaning of conditionals: Conditional prob-
abilities, mental models, and personal utilities. Journal of Experimental Psychology:
Learning, Memory, & Cognition, 29, 321–335.
Osherson, D., Perani, D., Cappa, S., Schnur, T., Grassi, F., & Fazio, F. (1998). Distinct
brain loci in deductive vs. probabilistic reasoning. Neuropsychologia, 36, 369–376.
Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible infer-
ence. San Mateo, CA: Morgan Kaufman.
Perelman, C., & Olbrechts-Tyteca, L. (1969). The new rhetoric: A treatise on argumenta-
tion. Notre Dame, IN: University of Notre Dame Press.
Politzer, G. (2005). Uncertainty and the suppression of inferences. Thinking and Rea-
soning, 11, 5–34.
Pollock, J. L. (2001). Defeasible reasoning with variable degrees of justification. Artificial
Intelligence, 133, 233–282.
Prakken, H., & Vreeswijk, G. A.W. (2002). Logics for defeasible argumentation. In D.
M. Gabbay & F. Guenthner (Eds.), Handbook of philosophical logic, 2nd ed., Vol. 4
(pp. 219–318). Dordrecht: Kluwer Academic Publishers.
Quinn, S., & Markovits, H. (2002). Conditional reasoning with causal premises: Evidence
for a retrieval model. Thinking and Reasoning, 8, 179–19.
Rapoport, A., Wallsten, T. S., & Cox, J. A. (1987). Direct and indirect scaling of mem-
bership functions of probability phrases. Mathematical Modelling, 9, 397–417.
Rips, L. J. (1994). The psychology of proof. Cambridge, MA: MIT Press.
Rips, L. J. (2001a). Two kinds of reasoning. Psychological Science, 12, 129–134.
Rips, L. J. (2001b). Reasoning imperialism. In R. Elio (Ed.), Common sense, reasoning,
and rationality (pp. 215–235). Oxford, England: Oxford University Press.
Rips, L. J. (2002). Reasoning. In H. F. Pashler (Series Ed.) and D. L. Medin (Vol. Ed.),
Stevens’ handbook of experimental psychology: Vol. 2. Cognition (3rd Ed.). New York:
Wiley.
Schroyens, W., & Schaeken, W. (2003). A critique of Oaksford, Chater, and Larkin’s
(2000) conditional probability model of conditional reasoning. Journal of Experimen-
tal Psychology: Learning, Memory, & Cognition, 29, 140–149.
Schroyens, W. J., Schaeken, W., & Handley, S. (2003). In search of counter-examples: De-
ductive rationality in human reasoning. Quarterly Journal of Experimental Psychology,
56A, 1129–1145.
Scott, D. (1971). On engendering an illusion of understanding. The Journal of Philosophy,
68, 787–807.
Skyrms, B. (1975). Choice and chance (2nd Ed.). Belmont, CA: Wadsworth.
Stalnaker, R. C. (1968). A theory of conditionals. In N. Rescher (Ed.), Studies in Logical
Theory (pp. 98–113). Oxford: Blackwell.
Stevenson, R. J., & Over, D. E. (1995). Deduction from uncertain premises. Quarterly
Journal of Experimental Psychology, 48A, 613–643.
Veltman, F. (1985). Logics for conditionals. PhD. Thesis, Faculteit der Wiskunde en
Natuurwetenschappen, University of Amsterdam.
Wallsten, T. S., Budescu, D. V., Rapoport, A., Zwick, R., & Forsyth, B. (1986). Measuring
the vague meanings of probability terms. Journal of Experimental Psychology: General,
115, 348–365.
Walton, D. N. (2004). Relevance in argumentation. Mahwah, NJ: Lawrence Erlbaum.
12
In this chapter I hope to demonstrate that answering the question “who does
what in reasoning experiments?” can also help us answer fundamental ques-
tions about the nature of thought. Because it is nearly always the case that
some experimental participants display phenomena of interest and others
don’t, almost every experiment run by cognitive psychologists produces data
on individual differences. But, as most psychologists do not wish to take dif-
ferences between their participants as the starting point of their investigations,
individual differences tend to be ignored. Thus, the means that experimental
psychologists report when they describe their data abstract across differences
between individuals, with only the standard deviation or standard error in-
dicating the extent to which participants varied in their responses. In this
chapter I will address what the study of differences between individuals might
tell us about a range of issues in inductive reasoning, including what consti-
tutes a good inference, what processes underlie induction, and how induction
differs from deduction.
My primary focus will be on the general question of how individual-
differences data support dual-process theories of thinking (Evans & Over,
1996; Stanovich, 1999; Sloman, 1996). Dual-process theories have been ap-
plied to deduction, decision making, and induction. I will outline the dual-
process approach and describe evidence in its favour, paying particular atten-
tion to individual differences. I will also describe how individual-differences
methodology has been used to arbitrate between different normative ac-
counts of specific thinking tasks (see Stanovich & West, 1998b, c). With this
background covered I will consider what individual differences might tell us
about category-based induction. In particular, what can the method tell us
above and beyond more standard experimental methods about people’s in-
ductive reasoning abilities? Does the method have anything to say about how
a dual-process approach might be applied to induction across categories? And
can it inform the debate about the normative status of inductive phenom-
ena? I will describe the results of two studies designed with these questions
in mind.
I will conclude with a consideration of other questions about induction
to which the individual-differences methodology might help provide some
answers. I will briefly describe a recent study designed to investigate who
is susceptible to inductive reasoning fallacies, that is, judgements about the
strength of inductive arguments that depart from the prescriptions of some
normative theory of induction. In addition, I will consider recent claims
about dissociations between induction and deduction (Rips, 2001; Rips and
Asmuth, Chapter 10, this volume; Oaksford and Hahn, Chapter 11, this
volume; Heit & Rotello, 2005). I will outline a dual-process perspective on
this dissociation and argue that an individual-differences approach allows
one to test the dual-process account. First, however, as dual-process theories
of thinking have motivated much of the individual differences work I will
describe, it is important to review them in some detail here.
with age. This makes sense if logical or Type-2 thinking is associated with
working memory and IQ, both of which decline as we get older (e.g.,
Hasher & Zacks, 1988; Horn & Cattell, 1967). In another line of investigation
it has been shown that susceptibility to belief bias on conflict problems may be
reduced by stressing the concept of logical necessity in the instructions (Evans,
Newstead, Allen, & Pollard, 1994). This result is predicted because of the con-
trolled nature of Type-2 thinking processes. Further evidence for the dual-
process interpretation comes from a recent study by Evans and Curtis-Holmes
(2005) using a speeded response paradigm (for a related use of this paradigm
see Shafto, Colony, and Vitkin, Chapter 5 this volume). When participants
were required to make a speeded response, they were more susceptible to the
effects of belief in conflict problems. To summarise then, responses based on
Type-1 processes are primary, but responses based on Type-2 processing can
dominate if participants are instructed to suppress Type-1 outputs, if they have
sufficient time for the slower Type-2 processes to operate, or if they are young.
Before proceeding it is worthwhile to consider two other recent sources
of particularly compelling evidence. Goel and Dolan (2003) have reported
the results of an fMRI study of belief bias in which they found very strong
evidence for separate types of thinking. When participants gave belief-based
responses on conflict problems, researchers observed increased activation in
ventral medial prefrontal cortex (VMPFC). When these same participants
gave logical responses on conflict problems, the researchers observed in-
creased activation in right inferior prefrontal cortex. Goel and Dolan argue
that VMPFC is associated with the effects of beliefs in reasoning (see also
Adolphs, Tranel, Bechara, Damasio, & Damasio, 1996), whereas activation in
inferior prefrontal cortex is due to the inhibition of a response based on Type-1
thinking processes in favour of a Type-2 or logical response. The notion that
inhibition is important to logical reasoning is further supported by some
data on children’s reasoning reported by Handley, Capon, Beveridge, Dennis,
and Evans (2004). These authors found that scores on the conflict index in
children aged ten years were correlated with performance on a task designed
to measure ability to inhibit a prepotent response. Children who were more
likely to respond logically had higher inhibitory control. Both of these studies
suggest the existence of two systems for reasoning, one of which is dominant
and whose effects must be suppressed by the other.
Handley et al.’s study is interesting in the current context because it
employed an individual-differences methodology. However, the individual-
differences variable measured was response inhibition and, although there is
currently considerable interest in the relationship between measures of inhi-
bition and working memory (see Kane & Engle, 2003), there has been little
other work on inhibition and thinking. By far the most popular individual
tasks. Stanovich (1999, p. 143) claims that large differences in performance
due to cognitive ability will be found only for thinking tasks that engage both
systems. When a thinking task engages the associative system only, then no
difference will be observed. In addition, individual differences are more likely
to be found when the systems produce conflicting responses.
Stanovich illustrates this argument with reference to performance on de-
ontic and indicative versions of Wason’s selection task (see Stanovich & West,
1998b). In the selection task participants are presented with a conditional
rule that is said to govern a set of instances. In standard indicative versions
of the task, the rule governs what is printed on each side of sets of cards. For
example, “If there is an A on one side of the card (p), then there is a 3 on
the other side (q).” Participants are shown four cards. On the visible sides are
printed an A (p), an L (not-p), a 3 (q), and a 7 (not-q). Participants are asked
which card or cards need to be turned over to test the rule. Deontic versions of
the selection task concern what should or ought to be done. Most often, they
concern social laws, obligations, or permissions. For example, “If someone is
drinking alcohol in a bar (p) then they must be over 18 (q)”. Printed on the
visible side of the cards in this example might be "Beer" (p), "Coke" (not-p),
“21” (q), and “16” (not-q).
One of the earliest findings in the literature on the indicative selection
task (see Wason, 1966) was that people tended not to select the cards that
could falsify the rule (the p and the not-q cards in the example above).
However, when deontic content is included (see Johnson-Laird, Legrenzi, &
Legrenzi, 1972), people regularly select the falsifying cases. There are many
explanations of why people behave differently on the indicative and deontic
tasks (see Cheng & Holyoak, 1985; Cosmides, 1989; Manktelow & Over,
1991; Oaksford & Chater, 1994). Stanovich argues that most explanations
suggest that people’s selections on the deontic task are not determined by
conscious, analytic reasoning but by largely unconscious, heuristic processes.
Correct responding on the indicative task, on the other hand, requires that
the analytic system overcome the unconscious associative system which, in
most participants, is responsible for the selection of those cards that match
the items in the rule.
If Stanovich is right, then correct performance on the indicative task should
be associated with cognitive ability whereas correct performance on the de-
ontic task should not. Stanovich and West (1998b) confirmed this predic-
tion. They found clear evidence that people who reported higher SAT results
were more likely to select the potentially falsifying cards on the indicative
selection task. There was much less evidence for such an association on
the deontic task. Broadly similar results have been reported by Newstead,
Handley, Harley, Wright, and Farrelly (2004), who also found that whether
associations between ability and performance on deontic and indicative tasks
are found depends on the ability range of the participants sampled.
The second argument that Stanovich uses to motivate the individual-
differences approach relates to what counts as the correct normative response
on any particular task. There has been substantial debate about which is
the correct normative account of tasks as varied as Tversky and Kahneman's
(1983) Linda problem and Wason’s selection task (see Cohen, 1981; Oaksford
& Chater, 1994). Stanovich and West (1998b, c) argued that an association
between cognitive ability and certain patterns of performance on these tasks
might favour one normative account over another (see also Cohen, 1981).
For example, Oaksford and Chater (1994) have recast Wason’s selection task
as one of deciding which card or cards is likely to be most informative. From
a Bayesian perspective, given certain assumptions about the probability of
items mentioned in the conditional rule, it turns out that the most informa-
tive cards on the indicative task are the p and q cards. Oaksford and Chater
argue that their Bayesian reanalysis of the task is closer to a correct normative
analysis than is the Popperian notion of falsification, which claims that the
“correct” solution is to select the p and not-q cards.
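To see how the Bayesian recasting works in practice, the short sketch below computes the expected information gain of turning each card, assuming two hypotheses (the rule holds versus antecedent and consequent are independent), equal priors, and illustrative "rarity" values for P(p) and P(q). It is a minimal illustration of the optimal data selection idea, not Oaksford and Chater's published model or parameter values; the function names and numbers are assumptions.

import numpy as np

def entropy(ps):
    ps = np.asarray([p for p in ps if p > 0])
    return -np.sum(ps * np.log2(ps))

def joint(a, b, rule_holds):
    # Joint probabilities over (antecedent, consequent) combinations.
    if rule_holds:  # dependence hypothesis: P(q | p) = 1
        return {("p", "q"): a, ("p", "nq"): 0.0,
                ("np", "q"): b - a, ("np", "nq"): 1.0 - b}
    # independence hypothesis
    return {("p", "q"): a * b, ("p", "nq"): a * (1 - b),
            ("np", "q"): (1 - a) * b, ("np", "nq"): (1 - a) * (1 - b)}

def expected_gain(card, a, b, priors=(0.5, 0.5)):
    # card is the visible face: "p", "np", "q", or "nq".
    prior_h = entropy(priors)
    antecedent_shown = card in ("p", "np")
    hidden_faces = ("q", "nq") if antecedent_shown else ("p", "np")
    gain = 0.0
    for d in hidden_faces:
        # Likelihood of each hidden face under each hypothesis, given the visible face.
        liks = []
        for rule_holds in (True, False):
            j = joint(a, b, rule_holds)
            pair = (card, d) if antecedent_shown else (d, card)
            margin = sum(v for k, v in j.items()
                         if k[0 if antecedent_shown else 1] == card)
            liks.append(j[pair] / margin)
        p_d = sum(l * pr for l, pr in zip(liks, priors))
        if p_d == 0:
            continue
        posterior = [l * pr / p_d for l, pr in zip(liks, priors)]
        gain += p_d * (prior_h - entropy(posterior))
    return gain

# With a rare antecedent and consequent, the p and q cards carry the most information.
for card in ("p", "q", "nq", "np"):
    print(card, round(expected_gain(card, a=0.1, b=0.2), 3))

Under these assumed values the expected gains order the cards p > q > not-q > not-p, which is the pattern Oaksford and Chater use to explain why selecting the p and q cards can be treated as rational.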
To arbitrate between these competing normative accounts, Stanovich and
West looked for associations between cognitive ability and the tendency to
select cards in accordance with each of these distinct normative approaches.
They found that more able participants were more likely to select the falsifying
cards than were the less able participants. Less able participants, on the other
hand, were more likely to select the p and q cards. Stanovich concludes that
the individual-differences data suggest that from a normative point of view,
it is more appropriate to think about the indicative selection task as one
requiring falsification rather than decisions about expected information gain.
Of course this argument may not always hold. For example, if one happens
to sample from the bottom of the range of ability scores, then the most able
participants in a particular sample may very well perform in non-normative
ways on a particular thinking task (see Newstead et al., 2004). However, if
one finds, across a range of samples and abilities, that the most able
participants consistently favour one response over another, then this
lends support to the claim that the response favoured by these participants is
the one that is normatively correct.
Unfortunately, although Stanovich and colleagues have investigated a num-
ber of phenomena (including belief bias), very little data has been pub-
lished on the relationship between inductive reasoning and general cog-
nitive ability. One exception is work by Stanovich and West (1998a) on a
task where participants make a decision in the face of conflicting evidence
Dual-Process Predictions
Using an individual-differences approach to study diversity will also enable us
to give preliminary consideration to how a dual-process view might be applied
to category-based induction. Certainly, at least one theory of induction makes
that for arguments like 6 and 7, where the conclusion category is specific
rather than general, in order to display sensitivity to diversity, people must
generate the lowest level superordinate category that contains the categories
in the premises and the conclusion.

Argument 6
Bears have property X
Gorillas have property X
Tigers have property X

Argument 7
Bears have property X
Mice have property X
Tigers have property X

When the conclusion in the argument is general, however, as it is in Argument
4, then people do not have to generate a covering category.
The need to generate a covering category has been implicated by López,
Gelman, Gutheil, and Smith (1992) as an important factor in the develop-
ment of category-based induction. López et al. found developmental trends
where sensitivity to diversity emerged for general arguments before specific
arguments. An individual-differences study will allow for an alternative test
of López et al.'s argument. If generating a covering category is an additional source
of difficulty, then sensitivity to diversity for specific arguments should be
related to cognitive ability. In addition, we should observe less sensitivity to
diversity when the conclusion category is specific rather than general. Perhaps
we might also expect to observe a weaker relationship between sensitivity to
diversity and ability with specific conclusions, as only the most able people in
any sample may display such sensitivity at above chance levels.
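To make the covering-category idea concrete, the toy sketch below implements a similarity-coverage calculation for Arguments 6 and 7. The pairwise similarity values, the small set of mammals standing in for the generated covering category, and the weighting parameter alpha are illustrative assumptions rather than Osherson et al.'s (1990) fitted quantities; the point is only that diverse premises raise the coverage term.

# Toy sketch of the similarity-coverage idea for the two specific-conclusion
# arguments above. Similarity values, the covering set, and alpha are assumptions.
MAMMALS = ["bear", "gorilla", "mouse", "tiger", "cow", "rabbit"]

# Hypothetical symmetric pairwise similarities in [0, 1].
PAIRS = {
    ("bear", "gorilla"): 0.7, ("bear", "tiger"): 0.6, ("gorilla", "tiger"): 0.5,
    ("bear", "mouse"): 0.2, ("gorilla", "mouse"): 0.2, ("mouse", "tiger"): 0.2,
    ("bear", "cow"): 0.5, ("gorilla", "cow"): 0.4, ("mouse", "cow"): 0.2,
    ("tiger", "cow"): 0.4, ("bear", "rabbit"): 0.3, ("gorilla", "rabbit"): 0.2,
    ("mouse", "rabbit"): 0.7, ("tiger", "rabbit"): 0.2, ("cow", "rabbit"): 0.3,
}
SIM = {frozenset(k): v for k, v in PAIRS.items()}

def sim(a, b):
    return 1.0 if a == b else SIM[frozenset((a, b))]

def scm_strength(premises, conclusion, covering=MAMMALS, alpha=0.5):
    # Similarity term: closeness of the conclusion category to the nearest premise.
    similarity = max(sim(conclusion, p) for p in premises)
    # Coverage term: how well the premises cover the generated covering category;
    # with a general conclusion the conclusion category itself plays this role.
    coverage = sum(max(sim(m, p) for p in premises) for m in covering) / len(covering)
    return alpha * similarity + (1 - alpha) * coverage

print("Argument 6 (bears, gorillas -> tigers):",
      round(scm_strength(["bear", "gorilla"], "tiger"), 3))
print("Argument 7 (bears, mice -> tigers):",
      round(scm_strength(["bear", "mouse"], "tiger"), 3))

With these toy numbers the diverse argument comes out stronger because its premises cover the assumed superordinate better; the extra step for specific conclusions is generating that covering set in the first place.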
The relevance account of category-based induction (Medin, Coley, Storms,
& Hayes, 2003) attributes diversity effects to participants’ hypotheses about
the nature of the blank feature. Although they focus on the roles played by
causal and environmental knowledge in producing exceptions to diversity,
when no such knowledge is available, participants will use the degree of
similarity between the categories in the premises to guide their hypothesis
formation. Dissimilar categories lead to hypotheses that the feature is general
and hence highly projectible, whereas similar categories lead to hypotheses
about specific and hence non-projectible features (see also Heit, 1998).
The relevance account appears to give no grounds for dual-process predic-
tions in its current form. However, the relevance theory of linguistic prag-
matics (Sperber & Wilson, 1986/1995), upon which the relevance account is
based, uses logical rules for non-demonstrative inference (see Sperber &
Wilson, 1995, chapter 2). Thus, people work out the implications of an
possibility has also been suggested by McDonald, Samuels, and Rispoli (1996)
and, as we noted earlier, is consistent with certain applications of relevance
theory to category-based induction.
Finally, there are two ways in which category-based inductive phenom-
ena generally, and the diversity effect specifically, differ from other reasoning
and decision-making phenomena studied by dual-process theorists. First,
category-based phenomena require people to draw on their extensive knowl-
edge of categories in the world. It is possible, therefore, that our individual-
differences findings may be due to people high in cognitive ability having
richer representations of categories than people lower in ability. This would
mean that their associative systems had more effective knowledge bases to
make inferences over, leading to greater sensitivity to diversity. Whilst this
counter-explanation does not apply to other dissociations between Type-1
and Type-2 processes, it is an important one in this context. However, in
the case of diversity, this counter-explanation seems to boil down to the un-
likely possibility that some undergraduate students have insufficient featural
knowledge to distinguish similar from dissimilar mammals.
Second, Stanovich (1999) argues that associations between cognitive abil-
ity and performance on particular tasks will be strongest when different
responses are cued by Type-1 and Type-2 processes. As sensitivity to diversity
is predicted by the associative account (Sloman, 1993) and a range of other
more rule-based accounts (e.g., Osherson et al., 1990; Heit, 1998), the task
we have employed here is, in this respect, unlike other tasks previously used
to dissociate types of thinking. Nonetheless, we have observed significant
associations between sensitivity to diversity and cognitive ability. This obser-
vation suggests that sensitivity to diversity is more likely when there is greater
involvement of Type-2 processes.
Normative Issues
There exist several cases in the literature on induction where the findings
predicted by a particular normative account are not observed, or, alternatively,
phenomena are observed that are not predicted by the normative account.
[Figure 12.1: two panels, (a) and (b), showing Mean Rating (3.5–5.5) by Ability (LA, HA); panel (a) plots two-conclusion against one-conclusion A and one-conclusion B arguments, panel (b) plots two-conclusion against one-conclusion close and one-conclusion distant arguments.]
Figure 12.1. Interaction between Ability and Conclusion Type from Feeney, Shafto, and Dunning (in press).
invalid but causally strong; invalid and causally weak. An example of each of
these inferences is to be seen in Table 12.1. For valid but causally weak and
invalid but causally strong problems, Rips observed an interaction such that
the former were rated stronger than the latter under deductive instructions,
whereas under inductive instructions this pattern was reversed. Rips argued
that this finding constitutes a psychological dissociation between induction
and deduction.
A dual-process view, on the other hand, is that although there are distinct
psychological processes for thinking, these processes do not map onto in-
duction and deduction (see Sloman, 1996). According to a dual-process view,
both Type-1 and Type-2 processes may be involved to a greater or lesser extent
in any thinking task, inductive or deductive. When participants distinguish
between deductive and inductive reasoning instructions, they do so because
they have explicit knowledge about the difference between concepts such as
necessity and plausibility, and, given different instructions, Type-2 processes
allow them to apply different standards for evaluating arguments.
Support for the dual-process view comes from an experiment by Feeney,
Dunning, and Over (in preparation) where eighty-one students from the Uni-
versity of Durham completed a subset of Rips’s problems under induction
conditions. Their task was to decide whether the conclusion of each argu-
ment was plausible given the information in the premises. Causal strength
and validity were independently manipulated, and participants saw four argu-
ments of each type. They circled “strong” if they thought that the argument
was plausible and “not strong” if they thought that it was implausible. As
with other studies described in this chapter, participants in this study com-
pleted the AH4. Their mean score was 94.40 (S.D. = 12.94). We carried out
a median split on our data by cognitive ability. The mean score of the forty-
one participants in the low ability group was 84.24 (S.D. = 8.20), whilst
[Figure 12.2: proportion of 'strong' responses (0.3–0.9) plotted for valid versus invalid arguments, separately for high ability and low ability participants.]
Figure 12.2. Means involved in the interaction between Ability and Validity from Feeney, Dunning, and Over (in preparation).
the mean score of the forty participants in the high ability group was 104.8
(S.D. = 7.42).
A 2 (Ability) × 2 (Validity) × 2 (Strength) mixed-design ANOVA on the
proportion of plausible judgements in each condition revealed significant
main effects of Validity, F(1, 79) = 114.59, MSE = .07, p < .001, and of
Strength, F(1, 79) = 172.45, MSE = .06, p < .001. Whilst the interaction
between Ability and Strength was non-significant, F(1, 79) = 1.05, MSE =
.06, p > .3, Ability did interact significantly with Validity, F(1, 79) = 5.14,
MSE = .07, p < .03. The means involved in this interaction are presented in
Figure 12.2, where it may be seen that high ability participants judge a greater
proportion of valid arguments to be strong than do low ability participants,
and they judge a greater proportion of invalid arguments to be “not strong”
than do low ability participants.
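For readers who want to see how such an analysis is typically set up, the sketch below simulates data in the same layout and runs the key 2 (Ability, between) × 2 (Validity, within) part of the design; the full study also crossed Strength as a second within-participants factor. The column names, the simulated effect sizes, and the use of the pingouin package are assumptions made for illustration; this is not the authors' analysis script or data.

# Simulated illustration of a 2 (Ability: between) x 2 (Validity: within) mixed
# ANOVA on the proportion of "strong" responses. Effect sizes are hypothetical.
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(0)
n_participants = 80  # hypothetical, close to the 81 reported

rows = []
for pid in range(n_participants):
    ability = "high" if pid < n_participants // 2 else "low"
    for validity in ("valid", "invalid"):
        base = 0.8 if validity == "valid" else 0.4
        # Assume the validity effect is larger for high-ability participants.
        boost = 0.08 if ability == "high" else -0.08
        boost = boost if validity == "valid" else -boost
        prop = float(np.clip(base + boost + rng.normal(0, 0.1), 0, 1))
        rows.append({"participant": pid, "ability": ability,
                     "validity": validity, "prop_strong": prop})

df = pd.DataFrame(rows)

aov = pg.mixed_anova(data=df, dv="prop_strong", within="validity",
                     subject="participant", between="ability")
print(aov[["Source", "F", "p-unc"]])  # includes the Ability x Validity interaction row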
It is clear from Figure 12.2 that overall, logical validity plays an important
role in people’s judgements of inductive strength. However, what is most
striking is that validity plays a greater role in the judgements of participants
high in cognitive ability than in the decisions made by participants lower in
ability. These results are consistent with the claim that inductive judgements
are mediated by two types of process. One type of process is sensitive to
content and context, whereas the other is sensitive to structure. The first type
of process underlies effects of belief in reasoning, but because it does not
depend on cognitive capacity, effects of belief do not interact with cognitive
ability. The second type of process underlies our sensitivity to logical validity.
Because it is dependent on cognitive capacity, sensitivity to logical validity in
this study interacts with cognitive ability.
In this chapter I have described how results from studies using the individual-
difference method have informed recent dual-process theorising about think-
ing. I have shown how the method may provide valuable insights into the pro-
cesses involved in induction. For example, sensitivity to evidential diversity is
associated with cognitive ability and, in the presence of a specific conclusion,
is displayed only by the most able participants. These results sug-
gest that the diversity effect is not explicable in wholly associationist terms
and that the generation of a covering category is, as appears to be predicted by
the SCM, an extra source of difficulty in induction. The method also appears
useful in testing Bayesian accounts of inductive reasoning and dual-process
accounts of the relationship between induction and deduction.
Individual differences are but one of several methods for testing models of
induction, and as a consequence, individual-differences studies are unlikely
to lead to the formulation of an entirely novel theory of induction. However,
as the range of approaches described in this volume attests, there is currently
a broad range of theories of induction formulated at a number of levels of
explanation. One might argue that currently the field does not need new
theories so much as it does evidence with which to discriminate between
existing theories and approaches. Individual-differences studies are likely to
be a rich source of discriminating evidence.
Please address correspondence to Aidan Feeney, Applied Psychology, Univer-
sity of Durham Queen's Campus, Thornaby, Stockton-on-Tees TS17 6BH,
United Kingdom, Email: [email protected]
References
Adolphs, R., Tranel, D., Bechara, A., Damasio, H., & Damasio, A. R. (1996). Neuropsy-
chological approaches to reasoning and decision-making. In A. R. Damasio (Ed.),
Neurobiology of decision-making. Berlin: Springer-Verlag.
Carnap, R. (1950). Logical foundations of probability. Chicago: University of Chicago
Press.
Carpenter, P. A., Just, M. A., & Shell, P. (1990). What one intelligence test measures: A
theoretical account of processing in the Raven Progressive Matrices test. Psychological
Review, 97, 404–431.
Cheng, P. W., & Holyoak, K. J. (1985). Pragmatic reasoning schemas. Cognitive Psychology,
17, 391–416.
Cohen, L. J. (1981). Can human irrationality be experimentally demonstrated? Behav-
ioral & Brain Sciences, 4, 317–370.
Cosmides, L. (1989). The logic of social exchange: Has natural selection shaped how
humans reason? Studies with the Wason selection task. Cognition, 31, 187–276.
Evans, J. St. B. T. (2003). In two minds: Dual process accounts of reasoning. Trends in
Cognitive Sciences, 7, 454–459.
Evans, J. St. B. T., Barston, J. L., & Pollard, P. (1983). On the conflict between logic and
belief in syllogistic reasoning. Memory & Cognition, 11, 295–306.
Evans, J. St. B. T., & Curtis-Holmes, J. (2005). Rapid responding increases belief bias:
Evidence for the dual process theory of reasoning. Thinking and Reasoning, 11, 382–
389.
Evans, J. St. B. T., Newstead, S. E., Allen, J. L., & Pollard, P. (1994). Debiasing by
instruction: The case of belief bias. European Journal of Cognitive Psychology, 6, 263–
285.
Evans, J. St. B. T., & Over, D. E. (1996). Rationality and reasoning. Hove, UK: Psychology
Press.
Feeney, A. (in press). How many processes underlie category-based induction? Effects
of conclusion specificity and cognitive ability. Memory & Cognition.
Feeney, A., Dunning, D. & Over, D. E. (in preparation). Reason distinctions: An indi-
vidual differences study.
Feeney, A., Shafto, P., & Dunning, D. (in press). Who is susceptible to conjunction
fallacies in category-based induction? Psychonomic Bulletin & Review.
Fong, G. T., Krantz, D. H., & Nisbett, R. E. (1986). The effects of statistical training on
thinking about everyday problems. Cognitive Psychology, 18, 253–292.
Frey, M. C., & Detterman, D. K. (2004). Scholastic assessment or g? The relationship
between the Scholastic Assessment Test and general cognitive ability. Psychological
Science, 15, 373–378.
Fry, A. F., & Hale, S. (1996). Processing speed, working memory, and fluid intelligence:
Evidence for a developmental cascade. Psychological Science, 7, 237–241.
Gilinsky, A. S., & Judd, B. B. (1994). Working memory and bias in reasoning across the
life-span. Psychology and Aging, 9, 356–371.
Goel, V., & Dolan, R. J. (2003). Explaining modulation of reasoning by belief. Cognition,
87, B11–B22.
Handley, S. J., Capon, A., Beveridge, M., Dennis, I., & Evans, J. St. B. T. (2004). Working
memory and inhibitory control in the development of children’s reasoning. Thinking
and Reasoning, 10, 175–195.
Hasher, L., & Zacks, R. T. (1988). Working memory, comprehension, and aging: A review
and a new view. The Psychology of Learning and Motivation, 22, 193–225.
Heim, A. W. (1967). AH4 group test of intelligence [Manual]. London: National Founda-
tion for Educational Research.
Heit, E. (1998). A Bayesian analysis of some forms of inductive reasoning. In M. Oaksford
& N. Chater (Eds.), Rational models of cognition (pp. 248–274). Oxford: Oxford
University Press.
Heit, E., & Feeney, A. (2005). Relations between premise similarity and inductive
strength. Psychonomic Bulletin and Review, 12, 340–344.
Heit, E., Hahn, U., & Feeney, A. (2005). Defending diversity. In W. Ahn, R. Goldstone,
B. Love, A. Markman, & P. Wolff (Eds.), Categorization inside and outside of the
laboratory: Essays in honor of Douglas L. Medin (pp. 87–99). Washington DC: American
Psychological Association.
Heit, E., & Rotello, C. (2005). Are there two kinds of reasoning? Proceedings of the Twenty-
Seventh Annual Conference of the Cognitive Science Society (pp. 923–928). Mahwah,
NJ: Erlbaum.
Hempel, C. G. (1966). Philosophy of natural science. Englewood Cliffs, NJ: Prentice Hall.
Horn, J. L., & Cattell, R. B. (1967). Age differences in fluid and crystallized intelligence.
Acta Psychologica, 26, 107–129.
Howson, C., & Urbach, P. (1993). Scientific reasoning: The Bayesian approach. Chicago:
Open Court.
Jepson, C., Krantz, D. H., & Nisbett, R. E. (1983). Inductive reasoning: Competence or
skill? Behavioral and Brain Sciences, 6, 494–501.
Johnson-Laird, P. N., Legrenzi, P., Girotto, V., Legrenzi, M., & Caverni, J.-P. (1999). Naive
probability: A mental model theory of extensional reasoning. Psychological Review,
106, 62–88.
Johnson-Laird, P., Legrenzi, P., & Legrenzi, S. (1972). Reasoning and a sense of reality.
British Journal of Psychology, 63, 395–400.
Kane, M. J., & Engle, R. W. (2003). Working memory capacity and the control of
attention: The contributions of goal neglect, response competition, and task set to
Stroop interference. Journal of Experimental Psychology: General, 132, 47–70.
Lo, Y., Sides, A., Rozelle, J., & Osherson, D. (2002). Evidential diversity and premise
probability in young children’s inductive judgment. Cognitive Science, 26, 181–
206.
López, A., Atran, S., Coley, J. D., Medin, D. L., & Smith, E. E. (1997). The tree of life:
Universal and cultural features of folkbiological taxonomies and inductions. Cognitive
Psychology, 32, 251–295.
López, A., Gelman, S. A., Gutheil, G., & Smith, E. E. (1992). The development of
category-based induction. Child Development, 63, 1070–1090.
McDonald, J., Samuels, M., & Rispoli, J. (1996). A hypothesis assessment model of
categorical argument strength. Cognition, 59, 199–217.
Malt, B. C., Ross, B. H., & Murphy, G. L. (1995). Predicting features for members of nat-
ural categories when categorization is uncertain. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 21, 646–661.
Manktelow, K. I., & Over, D. E. (1991). Social roles and utilities in reasoning with deontic
conditions. Cognition, 39, 85–105.
Marr, D. (1982). Vision. San Francisco: W. H. Freeman & Co.
Medin, D. L., Coley, J., Storms, G., & Hayes, B. K. (2003). A relevance theory of induction.
Psychonomic Bulletin & Review, 10, 517–532.
Murphy, G. L., & Ross, B. H. (1999). Induction with cross-classified categories. Memory
& Cognition, 27, 1024–1041.
Myrvold, W. C. (1995). Bayesianism and diverse evidence: A reply to Andrew Wayne.
Philosophy of Science, 63, 661–665.
Nagel, E. (1939). Principles of the theory of probability. Chicago: University of Chicago
Press.
Newstead, S. E., Handley, S. J., Harley, C., Wright, H., & Farrelly, D. (2004). Individual
differences in deductive reasoning. Quarterly Journal of Experimental Psychology, 57,
33–60.
Oaksford, M., & Chater, N. (1994). A rational analysis of the selection task as optimal
data selection. Psychological Review, 101, 608–631.
Osherson, D. N., Smith, E. E., Wilkie, O., López, A., & Shafir, E. (1990). Category-based
induction. Psychological Review, 97, 185–200.
Osman, M. (2004). An evaluation of dual-process theories of reasoning. Psychonomic
Bulletin & Review, 11, 988–1010.
Proffitt, J. B., Coley, J. D., & Medin, D. L. (2000). Expertise and category-based induction.
Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 811–828.
Rehder, B., & Hastie, R. (2001). Causal knowledge and categories: The effects of causal
beliefs on categorization, induction, and similarity. Journal of Experimental Psychology:
General, 130, 323–360.
Rips, L. J. (2001). Two kinds of reasoning. Psychological Science, 12, 129–134.
Ross, B. H., & Murphy, G. L. (1999). Food for thought: Cross-classification and category
organization on a complex real-world domain. Cognitive Psychology, 38, 495–553.
Sloman, S. A. (1993). Feature-based induction. Cognitive Psychology, 25, 231–280.
Sloman, S. A. (1996). The empirical case for two systems of reasoning. Psychological
Bulletin, 119, 3–22.
Sperber, D., & Wilson, D. (1986/1995). Relevance: Communication and cognition. Oxford:
Basil Blackwell.
Stanovich, K. E. (1999). Who is rational? Studies of individual differences in reasoning.
Mahwah, NJ: Erlbaum.
Stanovich, K. E., & West, R. F. (1998a). Individual differences in rational thought. Journal
of Experimental Psychology: General, 127, 161–188.
Stanovich, K. E., & West, R. F. (1998b). Cognitive ability and variation in selection task
performance. Thinking and Reasoning, 4, 193–230.
Stanovich, K. E. & West, R. F. (1998c). Individual differences in framing and conjunction
effects. Thinking and Reasoning, 4, 289–317.
Tversky, A., & Kahneman, D. (1983). Extensional versus intuitive reasoning: The conjunction
fallacy in probability judgment. Psychological Review, 90, 293–315.
Unsworth, N., & Engle, R. W. (2005). Working memory capacity and fluid abilities:
Examining the correlation between Operation Span and Raven. Intelligence, 33, 67–
81.
Wason, P. (1966). Reasoning. In B. Foss (Ed.), New horizons in psychology (pp. 135–151).
Harmondsworth: Penguin.
Wayne, A. (1995). Bayesianism and diverse evidence. Philosophy of Science, 62, 111–121.
13
Taxonomizing Induction
Steven A. Sloman
inference. But in the course of their argument, Rips and Asmuth review a
lot of evidence that students find deductive proofs hard to generate and that
they confuse them with demonstrations via examples. Their data give the
impression that nonexperts rarely rely on deductive inference.
The authors seem to agree that reasoning is either inductive or deductive.
Heit, Oaksford and Hahn, and Rips and Asmuth all make this dichotomy
explicit. So does Thagard, who views abduction as a species of induction
because both involve uncertain conclusions. On this view, if little reasoning
is deductive, then almost all reasoning is inductive.
The field of psychology at large seems to agree with this conclusion. Less
and less work is being done on how people make deductive inferences. One
of the central players in that line of work has concluded that logic is rarely
a reasonable standard to evaluate human reasoning and that most reasoning
is pragmatic and probabilistic (Evans, 2002). With the exception of Johnson-
Laird (2005), this seems to be a widely held view.
Of course, the claim that most reasoning is inductive is not terribly strong.
If we follow Skyrms (2000) in holding that an inductive argument is merely
one in which the conclusion is not necessary given the premises (see Heit’s
chapter), we are left with a vast number of argument forms and types. The
set of inductive arguments includes those that seem almost deductively valid
(if Houdini could hold his breath for fifteen minutes, then he could hold his
breath for fifteen minutes and one second), arguments that are exceptionally
weak (lobster is expensive today, therefore lobster was expensive in the 17th
century), causal arguments (A causes B and A occurred, therefore B probably
occurred), abductions (see Thagard’s chapter), and many more. Is there one
big homogeneous space of inductive arguments, or can they be grouped in
meaningful ways? This is the second theme of the book.
Normative Criteria
The implicit claim in Tenenbaum, Kemp, and Shafto (Chapter 7), made ex-
plicitly by Oaksford and Hahn, is that there is a single normative model of
On Computational Models
Tenenbaum, Kemp, and Shafto offer a valuable perspective on induction,
one that they claim is a computational-level model of human induction. Is
their model normative, identifying whether behaviors are well designed to
achieve their users’ goals, or descriptive, identifying which behaviors will be
produced? When Marr (1982) introduced the concept of a computational
model, he intended it as a functional description of what an organism is
trying to accomplish, the problems it faces, as opposed to the strategies it
uses to accomplish them (an algorithmic model) or how those strategies are
realized in a physical system (an implementational model).
A computational model in Marr’s sense is neither normative nor descriptive.
It is not normative in that it merely characterizes a problem without specifying
whether or not that is the right problem to solve. The characterization might
be correct or incorrect, but either way nothing is implied about whether the
organism’s thought or behavior is effective or not. A computational theory is
not descriptive because it is a representation of a problem in the world, not
necessarily anything that an organism does or even knows about at any level.
A computational theory of nest building by an ant colony describes a problem
that no single ant is trying to solve. The solution is an emergent property of
the behavior of a large number of ants blind to any larger purpose. Similarly, a
computational model of human induction, like the Bayesian model suggested
by Tenenbaum et al., is an abstract representation of a problem. Solving
that problem, or just trying to, does not require that people have access to
that computational representation. People do not need to know Bayes’s rule
Descriptive Models
The most pertinent question for this book is whether human inductive rea-
soning occurs in one or more fundamentally different ways. The chapters
offer several ways of cutting the pie:
no cut. Oaksford and Hahn argue for a single method of making inductive
inferences, one that conforms to Bayesian inference. They show that a range
of judgments about the strength of an inductive inference is consistent with
Bayesian tenets. This broadly supports the stance of Tenenbaum, Kemp, and
Shafto.
The strength of this proposal is its breadth. It offers a conception that covers
induction in all its forms and, unlike any other model, proposes a method that
is both mathematically well-defined and highly flexible in its ability to adjust
to all forms of relational knowledge. A danger of this approach is that with
great flexibility comes great freedom to fit all patterns of data, both real and
imagined. Theoretical power is a curse when it comes to demonstrating the
empirical value of a model because it allows the model to wrap itself around
any potential result, even those not found.
Although the chapters in this collection offer a variety of perspectives on
how to distinguish inductive forms, a theme common to many chapters is
the central role played by causal knowledge in inductive inference. Hayes
focuses on the role of causal relations in guiding people to make inferences
based on causally relevant properties rather than causally irrelevant ones.
Shafto et al. and Feeney provide strong evidence that people reason with
causal relations when they have sufficient knowledge. Rehder offers a causal
model of inductive inference. Tenenbaum et al. claim that causal structure is
an important type of relational structure for inference. Thagard argues that
causality is the core of explanation, and that explanation is the heart of the
abductive inferences that he discusses.
These authors are all in one way or another echoing Hume (1748), who
argued that inductive inferences are parasitic on causal beliefs (cf. Sloman &
Lagnado, 2005). As regards the psychology of induction, the jury is out about
whether all inductive inferences have a causal basis or whether only some do.
As many of the chapters attest, the evidence is overwhelming that other prin-
ciples also apply to particular inductive inferences. Some inferences clearly
appeal to similarity and others to categorical relations (Murphy & Ross). But it
might be that these other relations are just proxies for causal structure. Objects
that are more similar tend to share more causal roles, so similarity provides a
heuristic basis for making causally appropriate inferences. Hadjichristidis et
al. (2004) show that people’s inferences are proportional to a property’s causal
centrality but only to the extent that the objects are similar. This suggests that
people treat similarity as a cue for the extent to which objects share causal
structure. And when similarity conflicts with causal relations, causality wins
(Hayes).
Category knowledge serves as a proxy for causal structure in a couple of
senses. First, some categories are defined by their role in a causal relation. The
category of “brown things” is the set of things that cause light to reflect off
of them such that visual systems perceive brown. The category of “goals in a
hockey game” is the set of events that consist of an outcome that gives a hockey
team a point. Second, more complex categories are also associated with causal
roles, frequently multiple causal roles. For instance, the set of objects that we
call “automobile” tend to participate in causal roles involving transportation
of people, systems of steering and acceleration, expensive repairs, and so forth.
The most direct explanation for how an induction is made is not necessarily
causal. But there might be a causal explanation at some level operating in the
background to motivate the relational structure used to make the inductive
inference in every case.
The range of topics covered by the chapters in this book exposes the point
that Shafto et al. make explicitly: People use a variety of strategies and frames
for assessing inductive strength in everyday life. Strategies seem to be arrayed
along a series of levels that require progressively more conceptual effort and
provide progressively more justifiable responses. The hierarchy conforms to
the trade-off between cognitive effort and cognitive effect that determines
relevance (Wilson & Sperber, 2004). At the bottom of the ladder, induction
makes use of rough guesses and, at the top, involves deliberation including
construction and calculation:
Most effort, most justifiable: Set-theoretic frames (including categorical)
Some effort, moderate justifiability: Causal models
Least effort, least justifiable: Associative strategies (correlational; similarity)
The bottom level involves pattern recognition (Sloutsky & Fisher, 2004). Ob-
jects that have a particular pattern have some property; therefore objects with
similar patterns have that property. If I know a particular kind of mushroom
is poisonous, then I avoid all mushrooms that look like it unless I have more
specific information to help me along. But if I have causal knowledge to
deploy, and I have the time and motivation to deploy it, then I can do better.
For instance, if I can identify the poison-relevant attributes of the mushroom,
or if I can figure out from the first mushroom where the poison resides and
how to get rid of it, or if I can use the first mushroom to obtain some kind
of chemical imprint of the poison, then I can use that new information to
generalize to new mushrooms. I can use a causal model to draw a more spe-
cific and more certain conclusion about a novel object (Sloman, 2005). This
requires more effort, but the greater specificity and certainty offers a bigger
payoff.
However, causal knowledge is not perfect and causal relations might all be
probabilistic (Pearl, 2000), so in some cases one can do better than by using
a causal model. For instance, if in addition to causal knowledge, I know that
99% of the mushrooms in the area are poisonous, I would be unlikely to risk
eating the mushroom under any circumstances. This third kind of inference
is set theoretic in the sense that it involves a representation or a calculation
over a set of objects (all the mushrooms in the area). Lagnado and Sloman
(2004) argue that thinking about sets involves an outside view of a category
in terms of its extension as a set of instances, and that such thinking requires
special effort. People find it easier to think about categories from the inside, in
terms of their properties and relations among those properties. In the context
of induction, the distinction can be seen most clearly in the inclusion fallacy
of Shafir, Smith, and Osherson (1990):
versus
Most people judge the first argument stronger, although it cannot be if you
believe that all penguins are birds. The first argument entails that penguins
as well as all other birds have the property; the second that only penguins
do. So the second argument must be stronger, and this is obvious once you
see the category inclusion relation between penguins and birds – a relation
among sets. But most people do not consider that relation until it is pointed
out to them or made transparent in some other way. This kind of thinking
generally involves the greatest effort, especially without cues to aid it along,
but it offers the biggest payoff. In this case, it affords certainty about the
stronger argument.
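The set-inclusion point can be checked with a very small simulation. The species list and the probability that any species has the blank property are illustrative assumptions; whatever values are used, the event "all birds have the property" entails the penguin conclusion, so its estimated probability can never be higher.

# Small simulation of the inclusion relation behind the fallacy: because
# penguins are birds, "all birds have the property" cannot be more probable
# than "penguins have the property". Names and probabilities are illustrative.
import random

random.seed(1)
BIRDS = ["penguin", "robin", "sparrow", "ostrich", "eagle"]

def sample_world(p_has=0.7):
    # Each species independently has or lacks the blank property.
    return {species: random.random() < p_has for species in BIRDS}

worlds = [sample_world() for _ in range(100_000)]
p_all_birds = sum(all(w.values()) for w in worlds) / len(worlds)
p_penguins = sum(w["penguin"] for w in worlds) / len(worlds)

print(f"P(all birds have it)  ~ {p_all_birds:.3f}")
print(f"P(penguins have it)   ~ {p_penguins:.3f}")  # always at least as large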
The first level of the hierarchy is clearly associative and the top level is
clearly not. This is consistent with Feeney’s evidence that some inferences
are made intuitively and others involve deliberative processing. Some
dual-process models (Sloman, 1996) imply that we always engage in asso-
ciative reasoning at some level, that is, that the similarity-based response is
always available. In order to engage in deliberative processing, we need to
take whatever cues we can from language, perception, or elsewhere in order
to construct an effective organizational frame.
Beyond dual processes, the proposed framework is consistent with other
ideas suggested in this book. It is consistent with the idea that people use
similarity when neither a causal model nor a set-theoretic alternative is
available. One set-theoretic alternative is a category hierarchy, and Murphy and
Ross’s evidence that people rely on a single category when making inductions
may reflect a constraint on how people think about sets. The proposal is also
consistent with Tenenbaum et al.’s proposal that people use a variety of rela-
tional schemes that are individually appropriate for different inferences. But
the proposal does entail that we cannot reduce every inductive inference that
people make to a single inferential model, unless we choose a model – like
unconstrained Bayesian inference – that is so general that it does not have pre-
dictive power. To successfully predict human induction, the next generation
of models will have to respect this variety in inductive procedures.
This work was funded by NSF Award 0518147. It has benefited greatly from
the comments of Aidan Feeney, Evan Heit, and especially Phil Fernbach.
References
Barbey, A. K., & Sloman, S. A. (2006). Base-rate respect: From ecological rationality to
dual processes. Behavioral and Brain Sciences.
Dougherty, M. R. P., Gettys, C. F., & Thomas, R. P. (1997). The role of mental simulation
in judgments of likelihood. Organizational Behavior and Human Decision Processes,
70, 135–148.
Evans, J. St. B. T. (2002). Logic and human reasoning: An assessment of the deduction
paradigm. Psychological Bulletin, 128, 978–996.
Evans, J. St. B. T., & Over, D. E. (1996). Rationality and reasoning. Hove, UK: Psychology
Press.
Fernbach, P. M. (2006). Sampling assumptions and the size principle in property in-
duction. Proceedings of the Twenty-Eighth Annual Conference of the Cognitive Science
Society, Vancouver, Canada.
Fischhoff, B., Slovic, P., & Lichtenstein, S. (1978). Fault trees: Sensitivity of estimated
failure probabilities to problem representations. Journal of Experimental Psychology:
Human Perception and Performance, 4, 330–344.
Gilovich, T., Griffin, D., & Kahneman, D. (2002). Heuristics and biases: The psychology
of intuitive judgment. Cambridge: Cambridge University Press.
Hadjichristidis, C., Sloman, S. A., Stevenson, R. J., & Over, D. E. (2004). Feature centrality
and property induction. Cognitive Science, 28, 45–74.
Howson, C., & Urbach, P. (1993). Scientific reasoning: The Bayesian approach (2nd ed.).
Chicago: Open Court.
Hume, D. (1748). An enquiry concerning human understanding. Oxford: Clarendon.
Johnson-Laird, P. (2005). Mental models and thought. In K. Holyoak & R. Morrison
(Eds.), The Cambridge handbook of thinking & reasoning, pp. 185–208. New York:
Cambridge University Press.
Lagnado, D. A., & Shanks, D. R. (2003). The influence of hierarchy on probability
judgment. Cognition, 89, 157–178.
Lagnado, D., & Sloman, S. A. (2004). Inside and outside probability judgment. In D. J.
Koehler & N. Harvey (Eds.), Blackwell handbook of judgment and decision making,
pp. 157–176. Oxford, UK: Blackwell Publishing.
Lombrozo, T. (in press). The structure and function of explanations. Trends in Cognitive
Sciences.
Marr, D. (1982). Vision: A computational investigation into the human representation
and processing of visual information. New York: W. H. Freeman and Company.
McDonald, J., Samuels, M., & Rispoli, J. (1996). A hypothesis assessment model of
categorical argument strength. Cognition, 59, 199–217.
Oaksford, M., & Chater, N. (Eds.) (1998). Rational models of cognition. Oxford, UK:
Oxford University Press.
Osherson, D.N., Smith, E. E., Wilkie, O., Lopez, A., & Shafir, E. (1990). Category-based
induction. Psychological Review, 97, 185–200.
Quine, W. V. (1970). Natural kinds. In N. Rescher (Ed.), Essays in honor of Carl G.
Hempel, pp. 5–23. Dordrecht: D. Reidel.
Shafir, E., Smith, E., & Osherson, D. (1990). Typicality and reasoning fallacies. Memory
and Cognition, 18, 229–239.
Skyrms, B. (2000). Choice and chance: An introduction to inductive logic. (Fourth edition).
Belmont, CA: Wadsworth.
Sloman, S. A. (1994). When explanations compete: The role of explanatory coherence
on judgments of likelihood. Cognition, 52, 1–21.
Sloman, S. A. (1996). The empirical case for two systems of reasoning. Psychological
Bulletin, 119, 3–22.
Sloman, S. A. (1998). Categorical inference is not a tree: The myth of inheritance
hierarchies. Cognitive Psychology, 35, 1–33.
Sloman, S. A., & Lagnado, D. (2005). The problem of induction. In K. Holyoak & R.
Morrison (Eds.), The Cambridge handbook of thinking & reasoning, pp. 95–116. New
York: Cambridge University Press.
Sloman, S. A., & Wisniewski, E. (1992). Extending the domain of a feature-based model
of property induction. Proceedings of the Fourteenth Annual Conference of the Cognitive
Science Society, Bloomington, IN.
Sloutsky, V. M., & Fisher, A. V. (2004). Induction and categorization in young children:
A similarity-based model. Journal of Experimental Psychology: General, 133, 166–188.
Stanovich, K. E. (1999). Who is rational? Studies of individual differences in reasoning.
Mahwah, NJ: Erlbaum.
Tenenbaum, J. B. (1999). Bayesian modeling of human concept learning. In M. Kearns,
S. Solla, & D. Cohn (Eds.), Advances in neural information processing systems 11,
pp. 59–65. Cambridge: MIT Press.
Tenenbaum, J. B., & Griffiths, T. L. (2001). Generalization, similarity and Bayesian
inference. Behavioral and Brain Sciences, 24, 629–640.
Tversky, A., & Kahneman, D. (1983). Extensional versus intuitive reasoning: The con-
junction fallacy in probability judgment. Psychological Review, 90, 293–315.
Wilson, D., & Sperber, D. (2004). Relevance theory. In L.R. Horn & G. Ward (Eds.), The
handbook of pragmatics, pp. 607–632. Oxford, UK: Blackwell Publishing.
Index
Abrahamsen, A.A., 242, 244 138, 139, 140, 164, 165, 179, 180, 183,
Adams, E.W., 272, 278, 280, 281, 283, 295, 201, 310, 326
298 Attribution, 231
Adolphs, R., 305, 324 Availability, 114–134
Aguilar, C.M., 141, 163
Ahn, W., 86, 111, 113, 139, 163, 241, 244 Bacon, F., 12, 22
Alfonso-Reese, L.A., 173, 201 Baer, L., 55, 80
Algebra and mathematical induction, 259–60 Bailenson, J.N., 96, 111
Allen, J.L., 305, 325 Baillargeon, R., 80, 215, 224, 235, 244
Analogical reasoning, 50, 230, 240–241, 243 Baker, A., 122, 134, 135
Anderson, A.R., 297, 298 Baldwin, D.A., 26, 42, 51, 115, 131, 132, 135
Anderson, J.R., 134, 169, 197, 198, 201, 206, Baraff, E., 115, 124, 126, 135, 136, 194, 203
207, 212, 219, 220, 222, 224, 261, 262, Barbey, A.K., 332, 341
264, 268, 280, 298 Barsalou, L.W., 59, 78, 131, 132, 135, 138, 163
Andrews, G., 50, 52 Barston, J.L., 303, 304, 325
Anggoro, F., 64, 78 Base rate neglect, 211, 332
Argument from ignorance, 272, 290–294, 296 Basic level categories, 29, 103
Argument strength, Batsell, R.L., 154, 163
and AI knowledge representation, 297 Bauer, P., 29, 42, 51
and certainty, 5 Bay, S., 230, 246
and informal reasoning fallacies, 290–294 Bayesian approach,
and likelihood ratio, 287–288 and normative justification, 329–331
measures of, 5, 286–290 to belief updating in conditional reasoning,
Aristotle, 273, 290 285–286
Artificial categories, 208–210 to deductive reasoning, 8
Artificial expert systems, 152, 161–163 to evidential diversity, 14, 21
Asgharbeygi, N., 230, 246 to inductive reasoning, 21–22, 45–46,
Ashby, F.G., 173, 201 110–111, 167–200, 206–207, 218–223,
Asher, S.J., 170, 173, 193, 202 313, 318–321
Asmuth, J., 22, 266, 268, 328, 329 to informal reasoning fallacies, 290–294
Asymmetries in conditional inference, 283–286 Bayesian networks, 187, 230
Asymmetries in similarity judgment, 141 Bayes’ rule, 177
Asymmetries of projection , 55–78, 141, 172, Bayindir, H., 290, 293, 294, 299
184 Bechara, A., 305, 324
Atran, S., 16, 24, 32, 51, 55, 57, 60, 61, 66, 71, Bechtel, W., 242, 244
78, 79, 96, 103, 111, 112, 123, 124, 135, Belief bias, 303–306
Eliasmith, C., 233, 239, 244, 245 Fry, A.F., 306, 325
Emotion and reasoning, 227–228, 233–234, Fugelsang, J.A., 236, 245
240–241
Enderton, H., 274, 298 Gabbay, D., 296, 298, 299, 300
Engle, R.W., 305, 306, 326, 327 Gap model, 139, 142
Essentialism, 48–49, 103, 107, 111 Gap2 model, 109, 142–152, 163
Estin, P., 103, 111 Gati, I., 137, 166
Evans, J. St. B.T., 9, 23, 261, 267, 270, 280, 298, Gazzaniga, M. S., 236, 245
302, 303, 304, 305, 306, 325, 329, 334, Gelman S.A., 18, 19, 23, 24, 28, 29, 30, 31, 33,
341 34, 36, 37, 42, 43, 44, 45, 46, 48, 49, 51,
Evolutionary model, 181–183, 188–197 52, 53, 55, 57, 62, 78, 80, 85, 100, 102,
Example-based strategies in maths induction, 103, 107, 112, 138, 139, 163, 165, 168,
258–263, 266 172, 186, 187, 201, 204, 216, 224, 241,
Exemplars and inductive inference, 221–223 244, 312, 314, 315, 326
Expert systems, Gentner, D., 41, 50, 51, 66, 74, 79, 90, 113, 137,
see Artificial expert systems 141, 164, 165
Expertise, 16, 38, 96, 116, 124–127, 129, 310 Geometry and mathematical induction,
Explanation, 260–263
accounts of, 234–237 George, C., 280, 298
and abductive reasoning, 337–338 Gettys, C.E., 337, 341
and causality, 83, 95–102,109, 234–237 Gigerenzer, G., 138, 164
and induction, 41, 85, 338 Gilinsky, A.S., 304, 325
Extensional reasoning, 90–95 Gilovich, T., 230, 245, 332, 342
Girotto, V., 8, 24, 271, 295, 299, 321, 326
Fallis, D., 265, 267 Gleitman, H., 60, 78
Farrar., 27, 49, 51 Gleitman, L.R., 60, 78
Farrelly, D., 307, 308, 313, 316, 325 Goel, V., 10, 23, 240, 245, 248, 267, 305, 325
Fazio, F., 10, 24, 248, 268, 297, 300 Gold, B., 10, 23, 248, 267
Feature alignment, 141 Gold, E., 160, 164
Feature-based model, 20–21, 44–45, 65, 76, Goldfarb, W., 253, 267
130, 174–175, 310–311, 316–317, 337 Goldstone, R.L., 41, 51, 66, 79, 137, 141, 164,
Feature centrality, 86–87 165
Feeney, A., 6, 15, 17, 22, 23, 38, 52, 120, 135, Gooding, D.C., 13, 23
307, 310, 313, 319, 321, 322, 323, 324, Goodman, N., 15, 23, 62, 78, 137, 164, 174,
325, 334, 336, 338, 341 201
Feinberg, S.E., 286, 296 Gopnik, A., 168, 201, 204, 228, 245
Feldman, J., 236, 245 Gordon, M., 200, 203
Fernbach, P.M., 336, 341 Gorman, M.E., 13, 23
Fischhoff, B., 337, 342 Goswami, U., 50, 51
Fisher A.V., 28, 42, 45, 48, 49, 52, 339, 342 Graham, S.A., 27, 28, 30, 42, 51, 52, 54
Fitelson, B., 145, 164 Grassi, F., 10, 24, 248, 267, 297, 300
Flach, P.A., 230, 245 Grattan-Guinness, I., 252, 267
Florian, J., 29, 42, 51 Gray, W.D., 31, 53, 103, 113
Folkbiology, 55, 70, Green, B., 192, 202
Fong, G.T., 309, 325 Green, D.M., 292, 298
Foodwebs, 184–186, 194–195 Greenberg, M.J., 258, 263, 267
Forsyth, B., 276, 301 Greenfield, P.M., 26, 51
Fox, J., 296, 298 Griffin, D., 230, 245, 332, 342
Freedman, R., 55, 78 Griffiths, T.L., 14, 24, 168, 177, 182, 186, 194,
Frege, G., 248, 253, 254, 255, 265, 267 197, 201, 202, 204, 330, 331, 332, 333,
Frey, M.C., 306, 325 336, 337, 338, 341, 343
Gutheil, G., 18, 23, 24, 30, 33, 34, 42, 45, 49, Houde, O., 240, 245
52, 55, 57, 78, 100, 112, 138, 165, 312, Houle, S., 10, 23, 248, 268
314, 315, 326 Howson, C., 14, 23, 310, 326, 330, 342
Hu, R., 119, 127, 136
Hadjichristidis, C., 86, 87, 88, 101, 108, 110, Huang, Y.T., 30, 51
112, 338, 342 Hughes, G.E., 278, 299
Hahn, E., 27, 35, 36, 43, 48, 53 Hume, D., 13, 23, 328, 338, 342
Hahn, U., 15, 18, 19, 22, 23, 34, 42, 52, 138, Hypothesis-based model, 66
164, 286, 287, 288, 290, 292, 293, 294,
299, 300, 310, 326, 328, 329, 331 Inagaki, K., 31, 52, 55, 61, 65, 66, 67, 68, 78
Hale, S., 306, 326 Inclusion similarity effect, 335
Halford, G., 50, 52 Incoherence, 154, 156, 160
Hallett, M., 253, 267 Individual differences,
Hamblin, C.L., 290, 299 and cognitive constraints, 306
Hampton, J., 151, 164 and conjunction fallacy, 319–321
Handley, S.J., 6, 23, 280, 285, 298, 301, 305, and diversity, 309–317
307, 308, 313, 316, 325, 326 and dual process theories, 10, 306–309
Hanson, N.R., 229, 245 and induction, 309–324
Harel, G., 260, 268 and normative theories, 308, 309–310
Harley, C., 307, 308, 313, 316, 326 and Wason’s selection task, 307–308
Harman, G., 7, 23, 228, 245 and uncertain classification, 318
Hasher, L., 305, 325 Inductive bias, 168
Hastie, R., 103, 104, 106, 107, 111, 113, 320, Inductive selectivity, 117, 125–127, 132–134
327 Infancy, reasoning in, 27–28, 30, 35–36,
Hatano, G., 31, 52, 55, 61, 65, 67, 68, 78 Informal reasoning fallacies, 272–273, 290–294
Hattori, M., 287, 299 Inhelder, B., 26, 27, 52
Hayes, B.K., 1, 16, 19, 22, 23, 31, 32, 34, 38, 39, Intuitive theories, 168, 187
40, 43, 44, 45, 46, 52, 53, 54, 60, 67, 78, and cognitive development, 186
97, 109, 110, 111, 112, 119, 120, 121,
131, 132, 135, 138, 165, 197, 202, 310, Jaswal, V., 29, 42, 52
312, 319, 320, 326, 333, 334, 336, 338, Jeffrey, R.C., 153, 164
339 Jepson, C., 35, 53, 309, 326
Heim, A. W., 313, 325 Johnson, D.M., 31, 52, 103, 113
Heit, E., 1, 8, 11, 14, 15, 16, 17, 18, 19, 21, 22, Johnson, S., 55, 79
24, 34, 35, 38, 42, 44, 45, 46, 51, 84, 85, Johnson-Laird, P.N., 7, 8, 23, 24, 147, 164, 269,
100, 109, 110, 112, 117, 120, 130, 132, 271, 273, 289, 295, 299, 307, 321, 326,
135, 138, 139, 164, 170, 172, 175, 188, 329, 337, 342
189, 201, 202, 207, 216, 218, 224, 276, Jones, S.S., 28, 52
299, 303, 309, 310, 312, 313, 317, 320, Josephson, J.R., 229, 230, 231, 232, 233, 234,
321, 325, 326, 328 235, 236, 237, 238, 239, 240, 245
Hempel, C.G., 13, 23, 310, 326 Josephson, S.G., 229, 230, 231, 232, 233, 234,
Henderson, A.M., 30, 51 235, 236, 237, 238, 239, 240, 245
Heyman, G.D., 29, 37, 42, 43, 51 Judd, B.B., 304, 325
Hintzman, D.L., 170, 173, 193, 202 Juslin, P., 142, 164
Holland, J.H., 13, 23 Just, M.A., 306, 324
Holland, P.W., 286, 299
Holyoak, K.J., 13, 23, 243, 245, 246, 247, 307, Kahn, T., 34, 43, 52
324 Kahneman, D., 69, 79, 114, 115, 136, 138, 164,
Horn, J.L., 305, 326 166, 211, 225, 230, 245, 257, 267, 303,
Horowitz, L.M., 115, 134, 135 308, 318, 319, 327, 332, 342, 343
Horwich, P., 14, 23 Kakas, A.C., 230, 245
Kalish, C.W., 37, 43, 52, 216, 224, 241, 244 Loose, J.J., 48, 52
Kalish, W.B., 139, 163 Lopez, A., 8, 16, 18, 19, 20, 21, 22, 24, 27, 32,
Kane, M.J., 305, 326 33, 34, 44, 49, 52, 58, 59, 76, 79, 121,
Kane, R., 118, 136 123, 130, 135, 138, 139, 151, 165, 170,
Kapur, S., 10, 23, 248, 267 171, 173, 174, 180, 188, 192, 193, 203,
Kaye, R., 255, 267 216, 225, 310, 311, 312, 314, 315, 317,
Keeble, S., 235, 246 326, 337
Keil, F.C., 48, 52, 55, 57, 78, 85, 113, 137, 164 Love, B.C., 82, 96, 100, 112
Kelemen, D., 37, 43, 44, 52 Luenberger, D.G., 155, 165
Kemp, C., 110, 111, 115, 122, 135, 177, 182, Lynch, E.B., 55, 57, 61, 66, 67, 68, 69, 70, 71,
183, 186, 187, 193, 194, 200, 202, 203, 72, 73, 74, 78, 79, 80, 124, 135
204, 329, 330, 331, 332, 333, 337, 338,
341 Maass, W., 233, 246
Kilbreath, C.S., 27, 28, 42, 51 Machamer, P., 233, 242, 246
Kim, N.S., 86, 111 Machine-learning algorithms, 168
Kim, S.W., 86, 113 Macrae, C.N., 215, 224
Kincannon, A.P., 13, 23 Madden, E.H., 229, 245
Klahr, D., 230, 245 Madole, K.L., 41, 53
Kline, M., 252, 268 Magnani, L., 229, 231, 246
Knuth, D.E., 255, 268 Malt, B.C., 207, 208, 224, 318, 326
Koedinger, K.R., 261, 262, 264, 268 Mandler, J.M., 35, 36, 43, 53, 236, 246
Kotovsky, L., 235, 244 Manktelow, K.I., 307, 326
Krantz, D.H., 35, 53, 309, 325, 326 Mansinghka, V., 200, 203
Kruschke, J.K., 173, 193, 202 Marcus-Newhall, A., 231, 246
Kunda, Z., 35, 53, 227, 231, 245 Mareschal, D., 48, 52
Markman, A.B., 30, 53, 55, 74, 78, 79, 141,
Lagnado, D.A., 213, 214, 224, 230, 246, 337, 164
338, 139, 165, 340, 342 Markman, E.M., 26, 28, 29, 32, 33, 36, 42, 45,
Lakoff, G., 236, 246 51, 53, 85, 112, 172, 179, 201, 202
Langley, P., 230, 246 Markov mutation model, 182
Larkin, J., 270, 283, 295, 300 Markovits, H., 285, 300
Lassaline, M.E., 38, 41, 52, 86, 90, 101, 102, Marr, D., 169, 187, 202, 313, 326, 331, 342
103, 104, 105, 106, 107, 108, 109, 110, Martin, W.G., 260, 268
111, 112, 139, 164 Mathematical induction, 248–267, 328–329
Lawson, C., 31, 32, 45, 51 Massey, G.J., 290, 299
Legrenzi, P., 8, 24, 271, 295, 299, 307, 321, 326 Maximal similarity, 33, 311
Legrenzi, M.S., 8, 24, 271, 295, 299, 307, 321, as an approximation for evolutionary
326 model, 194, 196
Lemmon, E.J., 256, 257, 268 versus summed similarity, 173–174, 188,
Leslie, A.M., 57, 79, 235, 246 192–194
Levin, D.T., 137, 164 Mazoyer, B., 240, 245
Lewis, D., 278, 281, 299 McCarrel, N.S., 37, 43, 53
Lichtenstein, S., 337, 342 McClelland, J.L., 35, 48, 53, 174, 175, 203
Lien, Y., 230, 239, 246 McCloskey, M., 192, 202
Lipton, P., 228, 246 McDermott, D., 229, 245
Litt, A., 237, 243, 244, 247 McDonald, J., 46, 53, 66, 79, 317, 326, 335, 342
Liu, I.M., 280, 299 McDonough, L., 35, 36, 43, 53
Lo, Y., 14, 15, 19, 30, 33, 34, 42, 43, 52, 138, McKenzie, B.R.M., 197, 202
139, 164, 280, 299, 310, 326 Mechanisms
Logical fallacies, 272, and abduction, 241–243
Lombrozo, T., 338, 342 causal, 83, 85
Medin, D.L., 16, 22, 23, 24, 28, 32, 38, 46, 51, Neural mechanisms,
53, 55, 57, 59, 60, 61, 63, 64, 66, 67, 68, and inference, 237–239
69, 70, 71, 72, 73, 74, 77, 78, 79, 96, as dynamical systems, 232
103, 109, 110, 111, 112, 113, 119, 120, Neuropsychological studies of reasoning, 10,
121, 123, 124, 131, 132, 135, 137, 138, 240, 248–249, 297–298, 305
139, 140, 141, 142, 163, 164, 165, 168, Newstead, S.E., 261, 267, 305, 307, 308, 313,
174, 192, 197, 202, 216, 222, 223, 224, 316, 325, 326
225, 241, 244, 310, 312, 319, 320, 325, Nguyen, S.P., 118, 135, 216, 225
326, 327, 334, 336 Nilsson, N., 153, 165
Melartin, R.L., 26, 32, 45, 51 Nisbett, R.E., 13, 23, 35, 53, 124, 125, 135, 231,
Mellers, B., 240, 246 246, 309, 325, 326
Meltzoff, A., 168, 201 Niyogi, S., 168, 186, 204
Mental model theory, 7–8, 273–274, 285–286 Non-diversity via property reinforcement
Mervis, C.B., 31, 53, 59, 79, 103, 113 effect, 16–17, 120
Michalski, R., 63, 78, Non-diversity via causality effect, 97
Michotte, A., 235, 246 Non-monotonicity effects
Mikkelsen, L.A., 197, 202 in adults, 121,
Miller, C., 60, 78 in children, 33, 43, 46
Miller, D.T., 215, 225, 227, 245 Non-monotonicity via property reinforcement
Milne, A.B., 215, 224 effect, 121, 129
Milson, R., 134 Non-monotonicity and deduction, 270–271,
Mirels, H.L., 160, 164 282,
Misinterpreted necessity model, 261 Norenzayan, A., 124, 135
Mitchell, T.M., 168, 178, 202 Norman, S.A., 115, 134, 135
Modus ponens, 229, 269 Normative accounts of induction, 13–15,
and mathematical induction, 252–254, 329–332
265–267 Norvig, P., 152, 165
Moloney, M., 31, 32, 45, 51 Nosofsky, R.M., 170, 173, 193, 203
Monotonicity effects, Nute, D., 278, 299
in adults, 32,
in children, 32–35, 43 O’Reilly, A.W., 31, 51, 102, 112
Monotonicity principle, 140, 144, 148–149 Oakes, L: 41, 53
Moutier, F., 240, 245 Oaksford, M., 6, 8, 22, 23, 24, 169, 197, 201,
Multidimensional scaling, 157 203, 269, 270, 271, 273, 274, 280, 283,
Multimodal representations, 232–234, 242 284, 285, 286, 287, 288, 289, 290, 292,
Murphy, G.L., 41, 53, 118, 133, 135, 168, 174, 293, 294, 295, 297, 298, 299, 300, 301,
192, 197, 198, 202, 203, 207, 208, 209, 303, 307, 308, 325, 326, 328, 329, 331,
210, 211, 212, 213, 214, 216, 217, 218, 332, 342
219, 220, 221, 222, 223, 224, 225, 318, Oberauer, K., 280, 300
326, 327, 336, 337, 338, 341 Olbrechts-Tyteca, L., 293, 300
Mutation history, 181–183 Olver R.R., 26, 51
Mutation principle, 179, 181–183 One-process accounts of reasoning, 7–11,
Myers, T.S., 86, 112 332–333
Myrvold, W.C., 310, 326 Ortiz, C.L., 139, 165
Ortony, A., 103, 112
Nagel, E., 13, 24, 310, 326 Osherson, D.N., 8, 10, 14, 15, 16, 19, 20, 21, 22,
Nair, M., 91, 93, 94, 95, 112 24, 27, 30, 32, 33, 34, 42, 43, 44, 52, 58,
Narayan, S., 236, 245 59, 63, 76, 78, 79, 82, 85, 86, 109, 112,
Neapolitan, R., 160, 165 113, 121, 123, 130, 135, 136, 137, 138,
Needham, A., 235, 244 139, 140, 142, 145, 151, 154, 163, 164,
Nelson, L.J., 215, 225 165, 170, 171, 172, 173, 174, 180, 188,
Rozelle, J., 14, 15, 19, 23, 24, 30, 33, 34, 35, 42, Similarity effects
43, 52, 84, 85, 109, 110, 112, 117, 130, in adults, 83, 171
135, 138, 139, 164, 172, 202, 216, 218, in children, 42, 55–56,
224, 310, 326 Similarity Coverage model, 19–21, 33, 44, 59,
Rubin, D.C., 250, 268 76, 82–85, 95–100, 108, 130, 138–139,
Russell, B., 251, 253, 268 151, 170, 173–174, 188–194, 311–312,
Russell, S.J., 152, 165 324, 337
Simon, H., 230, 246
Salmon, W., 12, 24 Simons, D.J., 137, 164
Samuels, M., 46, 53, 66, 79, 317, 326, 335, SINC model, 45
342 Skyrms, B., 4, 24, 153, 165, 270, 301, 329,
Schaffer, M.M., 143, 165 342
Schaeken, W., 270, 285, 286, 288, 289, 300, Slippery slope argument, 272
301 Sloman, S.A., 8, 9, 20, 21, 22, 24, 27, 44, 65, 75,
Schnur, T., 10, 24, 248, 268, 297, 300 76, 79, 83, 85, 86, 87, 88, 101, 108, 110,
Schroyens, W.J., 270, 285, 286, 288, 289, 300, 112, 113, 130, 136, 138, 139, 165, 170,
301 171, 174, 175, 203, 230, 246, 302, 303,
Schulz, L., 168, 201, 203, 204 304, 306, 310, 311, 316, 317, 322, 327,
Schwartz, A., 240, 246 332, 334, 335, 337, 338, 340, 341, 342
Scientific reasoning, 12–13, 15, 186–187 Sloutsky, V.M., 28, 42, 45, 48, 49, 53, 339,
Scott, D., 277, 301 342
Scripts, 118, 128 Slovic, P., 337, 342
Seaton, C.E., 118, 135 Smith, E.E., 8, 16, 18, 19, 20, 21, 22, 24, 27, 30,
Selection task. 32, 33, 34, 42, 43, 44, 49, 53, 55, 58, 59,
see Wason’s selection task 76, 79, 82, 85, 86, 96, 100, 112, 113,
Shafir, E., 8, 16, 19, 20, 21, 22, 24, 27, 32, 33, 121, 123, 124, 125, 130, 135, 136, 137,
44, 53, 58, 59, 76, 79, 82, 85, 112, 113, 138, 139, 142, 151, 157, 158, 160, 164,
121, 123, 130, 135, 136, 137, 138, 139, 165, 170, 171, 172, 173, 174, 180, 188,
142, 151, 165, 170, 171, 172, 173, 174, 192, 193, 197, 200, 201, 203, 216, 225,
180, 188, 192, 193, 197, 200, 203, 216, 310, 311, 312, 314, 315, 317, 326, 327,
225, 310, 311, 317, 327, 337, 340, 342 337, 340, 342
Shafto, P., 22, 38, 53, 96, 109, 110, 111, 113, Smith, L.S., 28, 52, 256, 258, 268
115, 124, 125, 126, 127, 130, 131, 132, Smith, W.C., 137, 164
133, 135, 136, 172, 194, 197, 200, 203, Smolensky, P., 239, 246
305, 309, 319, 321, 325, 329, 330, 331, Social reasoning, 11, 124–125
332, 333, 337, 338, 339, 341 Sousa, P., 55, 57, 61, 67, 68, 69, 70, 71, 72, 73,
Shanks, D.R., 213, 214, 224, 337, 342 74, 78
Shrager, J., 230, 246 Speeded response paradigm, 305
Shell, P., 306, 324 Spelke, E.S., 57, 79
Shelley, C.P., 229, 230, 231, 240, 246, Sperber, D., 312, 327, 339, 343
247 Springer, K., 36, 53, 85, 113
Shepard, R.N., 15, 24, 197, 203 Stalnaker, R.C., 278, 283, 301
Shipley, E.F., 62, 79, 102, 113 Stalnaker conditional, 283
Shoben, E.J., 59, 79, 157, 158, 160, 165 Stanovich, K.E., 9, 10, 24, 302, 303, 306, 307,
Shum, M.S., 96, 111 308, 309, 310, 316, 317, 318, 319, 327,
Sides, A., 14, 15, 19, 24, 30, 33, 34, 42, 43, 52, 334, 343
138, 139, 164, 307, 310, 326 Stepanova, O., 124, 126, 135, 136
Silverman, B., 173, 203 Stern, L.D., 170, 173, 193, 202
Similarity, 1, 8, 100–102, 109–110, 133, Stevenson, R.J., 86, 87, 88, 101, 108, 110, 112,
137–163, 170, 334, 341 280, 301, 338, 342
Stewart, I., 251, 252, 253, 254, 268 Two-process accounts of reasoning,
Steyvers, M., 197, 204 see dual process theories
Stob, M., 86, 112 Typicality effect
Stochastic processes, 184–185 in adults, 58, 82, 97–98, 171
Storms, G., 38, 46, 53, 60, 67, 79, 96, 109, 110, in children, 29–30, 42, 58–61
111, 112, 119, 120, 121, 131, 132, 135,
138, 165, 197, 202, 310, 312, 319, 320, Ucan E.K., 55, 57, 61, 67, 68, 69, 70, 71, 72, 73,
326, 336 74, 78
Stromsten, S., 182, 194, 201, 202 Uncertain categorisation and induction,
Structured representations, 169, 184–185, 197–199, 205–215, 218–224, 318
192 Uncertainty, aleatory vs. epistemic, 330
Subordinate level categories, 31–32 Universal generalization, 250, 256–258
Superordinate level categories, 19–20, 29, Unsworth, D., 306, 327
31–32, 35–36 Urbach, P., 14, 23, 310, 326, 330, 342
Sum formula (for mathematical induction),
249–250, 252, Valderrama, N., 55, 78,
Surprise, effects of Van Fraassen, B., 229, 247
in inductive reasoning, 62, Vapnarsky, V., 55, 57, 61, 67, 68, 69, 70, 71, 72,
in abductive reasoning, 227 73, 74, 78
Swets, J.A., 292, 298 Vardi, M., 154, 163
Veltman, F., 297, 301
Tall, D., 251, 252, 253, 254, 268 Vera, A., 55, 57, 78
Taxonomic principle, 179–180 Verde, M.F., 219, 220, 225
Taxonomic relations, 57, 114, 116, 118–119, Vitkin, A.Z., 118, 119, 127, 135, 136, 338, 339
122, 125–128, 131–134, 168, 185–186 Vreeswijk, G.A.W., 296, 300
Taxonomic trees, 180 Vygotsky, L.S., 26, 54
Tenenbaum, J.B., 14, 22, 24, 110, 111, 115,
136, 168, 177, 182, 183, 186, 193, 194, Wagar, B.M., 244, 247
197, 199, 200, 201, 202, 203, 204, 207, Wagenmakers, E.J., 197, 204
218, 219, 220, 221, 222, 225, 329, 330, Wallsten, T.S., 276, 298, 300
331, 332, 333, 336, 337, 338, 341, Walton, D.N., 297, 301
343 Warland, D., 233, 246
Tentori, K., 138, 145, 165 Wason, P., 307, 308, 325
Testa, T.J., 60, 63, 79 Wason’s selection task, 6, 307–308
Thagard, P., 13, 22, 89, 113, 228, 229, 230, 231, Waxman, S.R., 22, 28, 30, 51, 55, 64, 77, 78, 79,
233, 234, 237, 238, 240, 242, 243, 244, 336
245, 246, 247, 328, 329, 337, 338 Wayne, A.P., 14, 15, 24, 310, 327
Theophrastus’ rule, 297 Weisberg, R. W., 13, 24
Theory-based priors for reasoning, 178–187 Welder, A.N., 27, 28, 42, 51
Thomas, R.P., 337, 341 Wellman, H.M., 57, 80, 168, 186, 187, 204
Thompson, S., 38, 39, 40, 43, 44, 50, Wenzel, A.E., 250, 268
54 West, R.F., 302, 307, 308, 309, 318, 319, 327
Tranel, D., 305, 324 Widdowson, D., 37, 43, 44, 52
Tsvachidis, S., 154, 163 Wilhelm, O., 280, 300
Tulving, E., 115, 135 Wilkie, O., 16, 19, 20, 21, 22, 24, 27, 32, 33, 44,
Turner, H., 139, 166 53, 58, 59, 76, 79, 82, 112, 121, 123,
Tversky, A., 65, 69, 79, 114, 115, 136, 137, 138, 130, 135, 138, 151, 165, 170, 171, 173,
141, 164, 166, 211, 225, 257, 267, 303, 174, 180, 188, 192, 193, 203, 216, 225,
308, 318, 319, 327, 332, 343 310, 311, 317, 327, 337
Tweney, R.D., 13, 23 Wilson, D., 312, 327, 339, 343
Wisniewski, E.J., 66, 80, 338, 342 Yopchick, J.E., 118, 135
Wright, H., 307, 308, 313, 316, 326
Wu, J., 280, 299 Zacks, R.T., 305, 325
Wu, M.L., 90, 113 Zago, L., 204, 245
Zwick, R., 276, 301
Xu, F., 177, 179, 197, 204 Zytkow, J., 230, 246