Lecture 2 - Judgment under uncertainty - Heuristics and biases
American Association for the Advancement of Science is collaborating with JSTOR to digitize, preserve and extend access to
Science.
http://www.jstor.org
This content downloaded from 134.117.10.200 on Sat, 13 Feb 2016 15:33:44 UTC
All use subject to JSTOR Terms and Conditions
Judgment under Uncertainty: Heuristics and Biases

Biases in judgments reveal some heuristics of thinking under uncertainty.

Amos Tversky and Daniel Kahneman

The authors are members of the department of psychology at the Hebrew University, Jerusalem, Israel.

Science, Vol. 185, 27 September 1974

Many decisions are based on beliefs concerning the likelihood of uncertain events such as the outcome of an election, the guilt of a defendant, or the future value of the dollar. These beliefs are usually expressed in statements such as "I think that . . . ," "chances are . . . ," "it is unlikely that . . . ," and so forth. Occasionally, beliefs concerning uncertain events are expressed in numerical form as odds or subjective probabilities. What determines such beliefs? How do people assess the probability of an uncertain event or the value of an uncertain quantity? This article shows that people rely on a limited number of heuristic principles which reduce the complex tasks of assessing probabilities and predicting values to simpler judgmental operations. In general, these heuristics are quite useful, but sometimes they lead to severe and systematic errors.

The subjective assessment of probability resembles the subjective assessment of physical quantities such as distance or size. These judgments are all based on data of limited validity, which are processed according to heuristic rules. For example, the apparent distance of an object is determined in part by its clarity. The more sharply the object is seen, the closer it appears to be. This rule has some validity, because in any given scene the more distant objects are seen less sharply than nearer objects. However, the reliance on this rule leads to systematic errors in the estimation of distance. Specifically, distances are often overestimated when visibility is poor because the contours of objects are blurred. On the other hand, distances are often underestimated when visibility is good because the objects are seen sharply. Thus, the reliance on clarity as an indication of distance leads to common biases. Such biases are also found in the intuitive judgment of probability. This article describes three heuristics that are employed to assess probabilities and to predict values. Biases to which these heuristics lead are enumerated, and the applied and theoretical implications of these observations are discussed.

Representativeness

Many of the probabilistic questions with which people are concerned belong to one of the following types: What is the probability that object A belongs to class B? What is the probability that event A originates from process B? What is the probability that process B will generate event A? In answering such questions, people typically rely on the representativeness heuristic, in which probabilities are evaluated by the degree to which A is representative of B, that is, by the degree to which A resembles B. For example, when A is highly representative of B, the probability that A originates from B is judged to be high. On the other hand, if A is not similar to B, the probability that A originates from B is judged to be low.

For an illustration of judgment by representativeness, consider an individual who has been described by a former neighbor as follows: "Steve is very shy and withdrawn, invariably helpful, but with little interest in people, or in the world of reality. A meek and tidy soul, he has a need for order and structure, and a passion for detail." How do people assess the probability that Steve is engaged in a particular occupation from a list of possibilities (for example, farmer, salesman, airline pilot, librarian, or physician)? How do people order these occupations from most to least likely? In the representativeness heuristic, the probability that Steve is a librarian, for example, is assessed by the degree to which he is representative of, or similar to, the stereotype of a librarian. Indeed, research with problems of this type has shown that people order the occupations by probability and by similarity in exactly the same way (1). This approach to the judgment of probability leads to serious errors, because similarity, or representativeness, is not influenced by several factors that should affect judgments of probability.

Insensitivity to prior probability of outcomes. One of the factors that have no effect on representativeness but should have a major effect on probability is the prior probability, or base-rate frequency, of the outcomes. In the case of Steve, for example, the fact that there are many more farmers than librarians in the population should enter into any reasonable estimate of the probability that Steve is a librarian rather than a farmer. Considerations of base-rate frequency, however, do not affect the similarity of Steve to the stereotypes of librarians and farmers. If people evaluate probability by representativeness, therefore, prior probabilities will be neglected. This hypothesis was tested in an experiment where prior probabilities were manipulated (1). Subjects were shown brief personality descriptions of several individuals, allegedly sampled at random from a group of 100 professionals (engineers and lawyers). The subjects were asked to assess, for each description, the probability that it belonged to an engineer rather than to a lawyer. In one experimental condition, subjects were told that the group from which the descriptions had been drawn consisted of 70 engineers and 30 lawyers. In another condition, subjects were told that the group consisted of 30 engineers and 70 lawyers. The odds that any particular description belongs to an engineer rather than to a lawyer should be higher in the first condition, where there is a majority of engineers, than in the second condition, where there is a majority of lawyers. Specifically, it can be shown by applying Bayes' rule that the ratio of these odds should be $(.7/.3)^2$, or 5.44, for each description. In a sharp violation of Bayes' rule, the subjects in the two conditions produced essentially the same probability judgments. Apparently, subjects evaluated the likelihood that a particular description belonged to an engineer rather than to a lawyer by the degree to which this description was representative of the two stereotypes, with little or no regard for the prior probabilities of the categories.

The subjects used prior probabilities correctly when they had no other information. In the absence of a personality sketch, they judged the probability that an unknown individual is an engineer to be .7 and .3, respectively, in the two base-rate conditions. However, prior probabilities were effectively ignored when a description was introduced, even when this description was totally uninformative. The responses to the following description illustrate this phenomenon:

    Dick is a 30 year old man. He is married with no children. A man of high ability and high motivation, he promises to be quite successful in his field. He is well liked by his colleagues.

This description was intended to convey no information relevant to the question of whether Dick is an engineer or a lawyer. Consequently, the probability that Dick is an engineer should equal the proportion of engineers in the group, as if no description had been given. The subjects, however, judged the probability of Dick being an engineer to be .5 regardless of whether the stated proportion of engineers in the group was .7 or .3. Evidently, people respond differently when given no evidence and when given worthless evidence. When no specific evidence is given, prior probabilities are properly utilized; when worthless evidence is given, prior probabilities are ignored (1).

Insensitivity to sample size. To evaluate the probability of obtaining a particular result in a sample drawn from a specified population, people typically apply the representativeness heuristic. That is, they assess the likelihood of a sample result, for example, that the average height in a random sample of ten men will be 6 feet (180 centimeters), by the similarity of this result to the corresponding parameter (that is, to the average height in the population of men). The similarity of a sample statistic to a population parameter does not depend on the size of the sample. Consequently, if probabilities are assessed by representativeness, then the judged probability of a sample statistic will be essentially independent of sample size. Indeed, when subjects assessed the distributions of average height for samples of various sizes, they produced identical distributions. For example, the probability of obtaining an average height greater than 6 feet was assigned the same value for samples of 1000, 100, and 10 men (2). Moreover, subjects failed to appreciate the role of sample size even when it was emphasized in the formulation of the problem. Consider the following question:

    A certain town is served by two hospitals. In the larger hospital about 45 babies are born each day, and in the smaller hospital about 15 babies are born each day. As you know, about 50 percent of all babies are boys. However, the exact percentage varies from day to day. Sometimes it may be higher than 50 percent, sometimes lower.
    For a period of 1 year, each hospital recorded the days on which more than 60 percent of the babies born were boys. Which hospital do you think recorded more such days?
    - The larger hospital (21)
    - The smaller hospital (21)
    - About the same (that is, within 5 percent of each other) (53)

The values in parentheses are the number of undergraduate students who chose each answer.

Most subjects judged the probability of obtaining more than 60 percent boys to be the same in the small and in the large hospital, presumably because these events are described by the same statistic and are therefore equally representative of the general population. In contrast, sampling theory entails that the expected number of days on which more than 60 percent of the babies are boys is much greater in the small hospital than in the large one, because a large sample is less likely to stray from 50 percent. This fundamental notion of statistics is evidently not part of people's repertoire of intuitions.

A similar insensitivity to sample size has been reported in judgments of posterior probability, that is, of the probability that a sample has been drawn from one population rather than from another. Consider the following example:

    Imagine an urn filled with balls, of which 2/3 are of one color and 1/3 of another. One individual has drawn 5 balls from the urn, and found that 4 were red and 1 was white. Another individual has drawn 20 balls and found that 12 were red and 8 were white. Which of the two individuals should feel more confident that the urn contains 2/3 red balls and 1/3 white balls, rather than the opposite? What odds should each individual give?

In this problem, the correct posterior odds are 8 to 1 for the 4:1 sample and 16 to 1 for the 12:8 sample, assuming equal prior probabilities. However, most people feel that the first sample provides much stronger evidence for the hypothesis that the urn is predominantly red, because the proportion of red balls is larger in the first than in the second sample. Here again, intuitive judgments are dominated by the sample proportion and are essentially unaffected by the size of the sample, which plays a crucial role in the determination of the actual posterior odds (2). In addition, intuitive estimates of posterior odds are far less extreme than the correct values. The underestimation of the impact of evidence has been observed repeatedly in problems of this type (3, 4). It has been labeled "conservatism."

Misconceptions of chance. People expect that a sequence of events generated by a random process will represent the essential characteristics of that process even when the sequence is short. In considering tosses of a coin for heads or tails, for example, people regard the sequence H-T-H-T-T-H to be more likely than the sequence H-H-H-T-T-T, which does not appear random, and also more likely than the sequence H-H-H-H-T-H, which does not represent the fairness of the coin (2). Thus, people expect that the essential characteristics of the process will be represented, not only globally in the entire sequence, but also locally in each of its parts. A locally representative sequence, however, deviates systematically from chance expectation: it contains too many alternations and too few runs. Another consequence of the belief in local representativeness is the well-known gambler's fallacy. After observing a long run of red on the roulette wheel, for example, most people erroneously believe that black is now due, presumably because the occurrence of black will result in a more representative sequence than the occurrence of an additional red. Chance is commonly viewed as a self-correcting process in which a deviation in one direction induces a deviation in the opposite direction to restore the equilibrium. In fact, deviations are not "corrected" as a chance process unfolds, they are merely diluted.

Misconceptions of chance are not limited to naive subjects. A study of the statistical intuitions of experienced research psychologists (5) revealed a lingering belief in what may be called the "law of small numbers," according to which even small samples are highly representative of the populations from which they are drawn. The responses of these investigators reflected the expectation that a valid hypothesis about a population will be represented by a statistically significant result in a sample, with little regard for its size. As a consequence, the researchers put too much faith in the results of small samples and grossly overestimated the replicability of such results. In the actual conduct of research, this bias leads to the selection of samples of inadequate size and to overinterpretation of findings.

Insensitivity to predictability. People are sometimes called upon to make such numerical predictions as the future value of a stock, the demand for a commodity, or the outcome of a football game. Such predictions are often made by representativeness. For example, suppose one is given a description of a company and is asked to predict its future profit. If the description of the company is very favorable, a very high profit will appear most representative of that description; if the description is mediocre, a mediocre performance will appear most representative. The degree to which the description is favorable is unaffected by the reliability of that description or by the degree to which it permits accurate prediction. Hence, if people predict solely in terms of the favorableness of the description, their predictions will be insensitive to the reliability of the evidence and to the expected accuracy of the prediction.

This mode of judgment violates the normative statistical theory in which the extremeness and the range of predictions are controlled by considerations of predictability. When predictability is nil, the same prediction should be made in all cases. For example, if the descriptions of companies provide no information relevant to profit, then the same value (such as average profit) should be predicted for all companies. If predictability is perfect, of course, the values predicted will match the actual values and the range of predictions will equal the range of outcomes. In general, the higher the predictability, the wider the range of predicted values.

Several studies of numerical prediction have demonstrated that intuitive predictions violate this rule, and that subjects show little or no regard for considerations of predictability (1). In one of these studies, subjects were presented with several paragraphs, each describing the performance of a student teacher during a particular practice lesson. Some subjects were asked to evaluate the quality of the lesson described in the paragraph in percentile scores, relative to a specified population. Other subjects were asked to predict, also in percentile scores, the standing of each student teacher 5 years after the practice lesson. The judgments made under the two conditions were identical. That is, the prediction of a remote criterion (success of a teacher after 5 years) was identical to the evaluation of the information on which the prediction was based (the quality of the practice lesson). The students who made these predictions were undoubtedly aware of the limited predictability of teaching competence on the basis of a single trial lesson 5 years earlier; nevertheless, their predictions were as extreme as their evaluations.

The illusion of validity. As we have seen, people often predict by selecting the outcome (for example, an occupation) that is most representative of the input (for example, the description of a person). The confidence they have in their prediction depends primarily on the degree of representativeness (that is, on the quality of the match between the selected outcome and the input) with little or no regard for the factors that limit predictive accuracy. Thus, people express great confidence in the prediction that a person is a librarian when given a description of his personality which matches the stereotype of librarians, even if the description is scanty, unreliable, or outdated. The unwarranted confidence which is produced by a good fit between the predicted outcome and the input information may be called the illusion of validity. This illusion persists even when the judge is aware of the factors that limit the accuracy of his predictions. It is a common observation that psychologists who conduct selection interviews often experience considerable confidence in their predictions, even when they know of the vast literature that shows selection interviews to be highly fallible. The continued reliance on the clinical interview for selection, despite repeated demonstrations of its inadequacy, amply attests to the strength of this effect.

The internal consistency of a pattern of inputs is a major determinant of one's confidence in predictions based on these inputs. For example, people express more confidence in predicting the final grade-point average of a student whose first-year record consists entirely of B's than in predicting the grade-point average of a student whose first-year record includes many A's and C's. Highly consistent patterns are most often observed when the input variables are highly redundant or correlated. Hence, people tend to have great confidence in predictions based on redundant input variables. However, an elementary result in the statistics of correlation asserts that, given input variables of stated validity, a prediction based on several such inputs can achieve higher accuracy when they are independent of each other than when they are redundant or correlated. Thus, redundancy among inputs decreases accuracy even as it increases confidence, and people are often confident in predictions that are quite likely to be off the mark (1).

Misconceptions of regression. Suppose a large group of children has been examined on two equivalent versions of an aptitude test. If one selects ten children from among those who did best on one of the two versions, he will usually find their performance on the second version to be somewhat disappointing. Conversely, if one selects ten children from among those who did worst on one version, they will be found, on the average, to do somewhat better on the other version. More generally, consider two variables X and Y which have the same distribution. If one selects individuals whose average X score deviates from the mean of X by k units, then the average of their Y scores will usually deviate from the mean of Y by less than k units. These observations illustrate a general phenomenon known as regression toward the mean, which was first documented by Galton more than 100 years ago.

In the normal course of life, one encounters many instances of regression toward the mean, in the comparison of the height of fathers and sons, of the intelligence of husbands and wives, or of the performance of individuals on consecutive examinations. Nevertheless, people do not develop correct intuitions about this phenomenon. First, they do not expect regression in many contexts where it is bound to occur. Second, when they recognize the occurrence of regression, they often invent spurious causal explanations for it (1). We suggest that the phenomenon of regression remains elusive because it is incompatible with the belief that the predicted outcome should be maximally representative of the input, and, hence, that the value of the outcome variable should be as extreme as the value of the input variable.

The failure to recognize the import of regression can have pernicious consequences, as illustrated by the following observation (1). In a discussion of flight training, experienced instructors noted that praise for an exceptionally smooth landing is typically followed by a poorer landing on the next try, while harsh criticism after a rough landing is usually followed by an improvement on the next try. The instructors concluded that verbal rewards are detrimental to learning, while verbal punishments are beneficial, contrary to accepted psychological doctrine. This conclusion is unwarranted because of the presence of regression toward the mean. As in other cases of repeated examination, an improvement will usually follow a poor performance and a deterioration will usually follow an outstanding performance, even if the instructor does not respond to the trainee's achievement on the first attempt. Because the instructors had praised their trainees after good landings and admonished them after poor ones, they reached the erroneous and potentially harmful conclusion that punishment is more effective than reward.

Thus, the failure to understand the effect of regression leads one to overestimate the effectiveness of punishment and to underestimate the effectiveness of reward. In social interaction, as well as in training, rewards are typically administered when performance is good, and punishments are typically administered when performance is poor. By regression alone, therefore, behavior is most likely to improve after punishment and most likely to deteriorate after reward. Consequently, the human condition is such that, by chance alone, one is most often rewarded for punishing others and most often punished for rewarding them. People are generally not aware of this contingency. In fact, the elusive role of regression in determining the apparent consequences of reward and punishment seems to have escaped the notice of students of this area.

Availability

There are situations in which people assess the frequency of a class or the probability of an event by the ease with which instances or occurrences can be brought to mind. For example, one may assess the risk of heart attack among middle-aged people by recalling such occurrences among one's acquaintances. Similarly, one may evaluate the probability that a given business venture will fail by imagining various difficulties it could encounter. This judgmental heuristic is called availability. Availability is a useful clue for assessing frequency or probability, because instances of large classes are usually recalled better and faster than instances of less frequent classes. However, availability is affected by factors other than frequency and probability. Consequently, the reliance on availability leads to predictable biases, some of which are illustrated below.

Biases due to the retrievability of instances. When the size of a class is judged by the availability of its instances, a class whose instances are easily retrieved will appear more numerous than a class of equal frequency whose instances are less retrievable. In an elementary demonstration of this effect, subjects heard a list of well-known personalities of both sexes and were subsequently asked to judge whether the list contained more names of men than of women. Different lists were presented to different groups of subjects. In some of the lists the men were relatively more famous than the women, and in others the women were relatively more famous than the men. In each of the lists, the subjects erroneously judged that the class (sex) that had the more famous personalities was the more numerous (6).

In addition to familiarity, there are other factors, such as salience, which affect the retrievability of instances. For example, the impact of seeing a house burning on the subjective probability of such accidents is probably greater than the impact of reading about a fire in the local paper. Furthermore, recent occurrences are likely to be relatively more available than earlier occurrences. It is a common experience that the subjective probability of traffic accidents rises temporarily when one sees a car overturned by the side of the road.

Biases due to the effectiveness of a search set. Suppose one samples a word (of three letters or more) at random from an English text. Is it more likely that the word starts with r or that r is the third letter? People approach this problem by recalling words that begin with r (road) and words that have r in the third position (car) and assess the relative frequency by the ease with which words of the two types come to mind. Because it is much easier to search for words by their first letter than by their third letter, most people judge words that begin with a given consonant to be more numerous than words in which the same consonant appears in the third position. They do so even for consonants, such as r or k, that are more frequent in the third position than in the first (6).

Different tasks elicit different search sets. For example, suppose you are asked to rate the frequency with which abstract words (thought, love) and concrete words (door, water) appear in written English. A natural way to answer this question is to search for contexts in which the word could appear. It seems easier to think of contexts in which an abstract concept is mentioned (love in love stories) than to think of contexts in which a concrete word (such as door) is mentioned. If the frequency of words is judged by the availability of the contexts in which they appear, abstract words will be judged as relatively more numerous than concrete words. This bias has been observed in a recent study (7) which showed that the judged frequency of occurrence of abstract words was much higher than that of concrete words, equated in objective frequency. Abstract words were also judged to appear in a much greater variety of contexts than concrete words.

Biases of imaginability. Sometimes one has to assess the frequency of a class whose instances are not stored in memory but can be generated according to a given rule. In such situations, one typically generates several instances and evaluates frequency or probability by the ease with which the relevant instances can be constructed. However, the ease of constructing instances does not always reflect their actual frequency, and this mode of evaluation is prone to biases. To illustrate, consider a group of 10 people who form committees of k members, $2 \le k \le 8$. How many different committees of k members can be formed? The correct answer to this problem is given by the binomial coefficient $\binom{10}{k}$ which reaches a maximum of 252 for k = 5. Clearly, the number of committees of k members equals the number of committees of (10 - k) members, because any committee of k
members defines a unique group of (10 - k) nonmembers.

One way to answer this question without computation is to mentally construct committees of k members and to evaluate their number by the ease with which they come to mind. Committees of few members, say 2, are more available than committees of many members, say 8. The simplest scheme for the construction of committees is a partition of the group into disjoint sets. One readily sees that it is easy to construct five disjoint committees of 2 members, while it is impossible to generate even two disjoint committees of 8 members. Consequently, if frequency is assessed by imaginability, or by availability for construction, the small committees will appear more numerous than larger committees, in contrast to the correct bell-shaped function. Indeed, when naive subjects were asked to estimate the number of distinct committees of various sizes, their estimates were a decreasing monotonic function of committee size (6). For example, the median estimate of the number of committees of 2 members was 70, while the estimate for committees of 8 members was 20 (the correct answer is 45 in both cases).

Imaginability plays an important role in the evaluation of probabilities in real-life situations. The risk involved in an adventurous expedition, for example, is evaluated by imagining contingencies with which the expedition is not equipped to cope. If many such difficulties are vividly portrayed, the expedition can be made to appear exceedingly dangerous, although the ease with which

natural associates, such as suspiciousness and peculiar eyes. This effect was labeled illusory correlation. In their erroneous judgments of the data to which they had been exposed, naive subjects "rediscovered" much of the common, but unfounded, clinical lore concerning the interpretation of the draw-a-person test. The illusory correlation effect was extremely resistant to contradictory data. It persisted even when the correlation between symptom and diagnosis was actually negative, and it prevented the judges from detecting relationships that were in fact present.

Availability provides a natural account for the illusory-correlation effect. The judgment of how frequently two events co-occur could be based on the strength of the associative bond between them. When the association is strong, one is likely to conclude that the events have been frequently paired. Consequently, strong associates will be judged to have occurred together frequently. According to this view, the illusory correlation between suspiciousness and peculiar drawing of the eyes, for example, is due to the fact that suspiciousness is more readily associated with the eyes than with any other part of the body.

Lifelong experience has taught us that, in general, instances of large classes are recalled better and faster than instances of less frequent classes; that likely occurrences are easier to imagine than unlikely ones; and that the associative connections between events are strengthened when the events frequently co-occur. As a result, man has at his disposal a procedure (the

That is, different starting points yield different estimates, which are biased toward the initial values. We call this phenomenon anchoring.

Insufficient adjustment. In a demonstration of the anchoring effect, subjects were asked to estimate various quantities, stated in percentages (for example, the percentage of African countries in the United Nations). For each quantity, a number between 0 and 100 was determined by spinning a wheel of fortune in the subjects' presence. The subjects were instructed to indicate first whether that number was higher or lower than the value of the quantity, and then to estimate the value of the quantity by moving upward or downward from the given number. Different groups were given different numbers for each quantity, and these arbitrary numbers had a marked effect on estimates. For example, the median estimates of the percentage of African countries in the United Nations were 25 and 45 for groups that received 10 and 65, respectively, as starting points. Payoffs for accuracy did not reduce the anchoring effect.

Anchoring occurs not only when the starting point is given to the subject, but also when the subject bases his estimate on the result of some incomplete computation. A study of intuitive numerical estimation illustrates this effect. Two groups of high school students estimated, within 5 seconds, a numerical expression that was written on the blackboard. One group estimated the product

$8 \times 7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1$

while another group estimated the
disasters are imagined need not reflect availabilityheuristic)for estimating the product
their actual likelihood. Conversely, the numerosityof a class, the likelihood of
risk involved in an undertaking may be an event, or the frequency of co-occur- 1x2x3x4x5X6x7X8
grossly underestimated if some possible rences, by the ease with which the To rapidly answer such questions, peo-
dangers are either difficult to conceive relevant mental operations of retrieval, ple may perform a few steps of compu-
of, or simply do not come to mind. construction, or association can be tation and estimate the product by
Illusory correlation. Chapman and performed. However, as the preceding extrapolation or adjustment. Because ad-
Chapman (8) have described an interest- examples have demonstrated,this valu- justments are typically insufficient, this
ing bias in the judgment of the fre- able estimation procedure results in procedure should lead to underestima-
quency with which two events co-occur. systematic errors. tion. Furthermore, because the result of
They presented naive judges with in- the first few steps of multiplication (per-
formation concerning several hypothet- formed from left to right) is higher in
ical mental patients. The data for each Adjustment and Anchoring the descending sequence than in the
patient consisted of a clinical diagnosis ascending sequence, the former expres-
and a drawing of a person made by In many situations,people make esti- sion should be judged larger than the
the patient. Later the judges estimated mates by starting from an initial value latter. Both predictions were confirmed.
the frequency with which each diagnosis that is adjustedto yield the final answer. The median estimate for the ascending
(such as paranoia or suspiciousness) The initial value, or starting point, may sequence was 512, while the median
had been accompanied by various fea- be suggested by the formulation of the estimate for the descending sequence
tures of the drawing (such as peculiar problem, or it may be the result of a was 2,250. The correct answer is 40,320.
eyes). The subjects markedly overesti- partial computation. In either case, Biases in the evaluation of conjunc-
mated the frequency of co-occurrence of adjustments are typically insufficient (4). tive and disjunctive events. In a recent
study by Bar-Hillel (9) subjects were given the opportunity to bet on one of two events. Three types of events were used: (i) simple events, such as drawing a red marble from a bag containing 50 percent red marbles and 50 percent white marbles; (ii) conjunctive events, such as drawing a red marble seven times in succession, with replacement, from a bag containing 90 percent red marbles and 10 percent white marbles; and (iii) disjunctive events, such as drawing a red marble at least once in seven successive tries, with replacement, from a bag containing 10 percent red marbles and 90 percent white marbles. In this problem, a significant majority of subjects preferred to bet on the conjunctive event (the probability of which is .48) rather than on the simple event (the probability of which is .50). Subjects also preferred to bet on the simple event rather than on the disjunctive event, which has a probability of .52. Thus, most subjects bet on the less likely event in both comparisons. This pattern of choices illustrates a general finding. Studies of choice among gambles and of judgments of probability indicate that people tend to overestimate the probability of conjunctive events (10) and to underestimate the probability of disjunctive events. These biases are readily explained as effects of anchoring. The stated probability of the elementary event (success at any one stage) provides a natural starting point for the estimation of the probabilities of both conjunctive and disjunctive events. Since adjustment from the starting point is typically insufficient, the final estimates remain too close to the probabilities of the elementary events in both cases. Note that the overall probability of a conjunctive event is lower than the probability of each elementary event, whereas the overall probability of a disjunctive event is higher than the probability of each elementary event. As a consequence of anchoring, the overall probability will be overestimated in conjunctive problems and underestimated in disjunctive problems.

Biases in the evaluation of compound events are particularly significant in the context of planning. The successful completion of an undertaking, such as the development of a new product, typically has a conjunctive character: for the undertaking to succeed, each of a series of events must occur. Even when each of these events is very likely, the overall probability of success can be quite low if the number of events is large. The general tendency to overestimate the probability of conjunctive events leads to unwarranted optimism in the evaluation of the likelihood that a plan will succeed or that a project will be completed on time. Conversely, disjunctive structures are typically encountered in the evaluation of risks. A complex system, such as a nuclear reactor or a human body, will malfunction if any of its essential components fails. Even when the likelihood of failure in each component is slight, the probability of an overall failure can be high if many components are involved. Because of anchoring, people will tend to underestimate the probabilities of failure in complex systems. Thus, the direction of the anchoring bias can sometimes be inferred from the structure of the event. The chain-like structure of conjunctions leads to overestimation, the funnel-like structure of disjunctions leads to underestimation.

Anchoring in the assessment of subjective probability distributions. In decision analysis, experts are often required to express their beliefs about a quantity, such as the value of the Dow-Jones average on a particular day, in the form of a probability distribution. Such a distribution is usually constructed by asking the person to select values of the quantity that correspond to specified percentiles of his subjective probability distribution. For example, the judge may be asked to select a number, X90, such that his subjective probability that this number will be higher than the value of the Dow-Jones average is .90. That is, he should select the value X90 so that he is just willing to accept 9 to 1 odds that the Dow-Jones average will not exceed it. A subjective probability distribution for the value of the Dow-Jones average can be constructed from several such judgments corresponding to different percentiles.

By collecting subjective probability distributions for many different quantities, it is possible to test the judge for proper calibration. A judge is properly (or externally) calibrated in a set of problems if exactly Π percent of the true values of the assessed quantities falls below his stated values of XΠ. For example, the true values should fall below X01 for 1 percent of the quantities and above X99 for 1 percent of the quantities. Thus, the true values should fall in the confidence interval between X01 and X99 on 98 percent of the problems.

Several investigators (11) have obtained probability distributions for many quantities from a large number of judges. These distributions indicated large and systematic departures from proper calibration. In most studies, the actual values of the assessed quantities are either smaller than X01 or greater than X99 for about 30 percent of the problems. That is, the subjects state overly narrow confidence intervals which reflect more certainty than is justified by their knowledge about the assessed quantities. This bias is common to naive and to sophisticated subjects, and it is not eliminated by introducing proper scoring rules, which provide incentives for external calibration. This effect is attributable, in part at least, to anchoring. To select X90 for the value of the Dow-Jones average, for example, it is natural to begin by thinking about one's best estimate of the Dow-Jones and to adjust this value upward. If this adjustment, like most others, is insufficient, then X90 will not be sufficiently extreme. A similar anchoring effect will occur in the selection of X10, which is presumably obtained by adjusting one's best estimate downward. Consequently, the confidence interval between X10 and X90 will be too narrow, and the assessed probability distribution will be too tight. In support of this interpretation it can be shown that subjective probabilities are systematically altered by a procedure in which one's best estimate does not serve as an anchor.

Subjective probability distributions for a given quantity (the Dow-Jones average) can be obtained in two different ways: (i) by asking the subject to select values of the Dow-Jones that correspond to specified percentiles of his probability distribution and (ii) by asking the subject to assess the probabilities that the true value of the Dow-Jones will exceed some specified values. The two procedures are formally equivalent and should yield identical distributions. However, they suggest different modes of adjustment from different anchors. In procedure (i), the natural starting point is one's best estimate of the quantity. In procedure (ii), on the other hand, the subject may be anchored on the value stated in the question. Alternatively, he may be anchored on even odds, or 50-50 chances, which is a natural starting point in the estimation of likelihood. In either case, procedure (ii) should yield less extreme odds than procedure (i).

To contrast the two procedures, a set of 24 quantities (such as the air distance from New Delhi to Peking) was presented to a group of subjects who assessed either X10 or X90 for each problem. Another group of subjects received the median judgment of the first group for each of the 24 quantities. They were asked to assess the odds that each of the given values exceeded the true value of the relevant quantity. In the absence of any bias, the second group should retrieve the odds specified to the first group, that is, 9:1. However, if even odds or the stated value serve as anchors, the odds of the second group should be less extreme, that is, closer to 1:1. Indeed, the median odds stated by this group, across all problems, were 3:1. When the judgments of the two groups were tested for external calibration, it was found that subjects in the first group were too extreme, in accord with earlier studies. The events that they defined as having a probability of .10 actually obtained in 24 percent of the cases. In contrast, subjects in the second group were too conservative. Events to which they assigned an average probability of .34 actually obtained in 26 percent of the cases. These results illustrate the manner in which the degree of calibration depends on the procedure of elicitation.

Discussion

This article has been concerned with cognitive biases that stem from the reliance on judgmental heuristics. These biases are not attributable to motivational effects such as wishful thinking or the distortion of judgments by payoffs and penalties. Indeed, several of the severe errors of judgment reported earlier occurred despite the fact that subjects were encouraged to be accurate and were rewarded for the correct answers (2, 6).

The reliance on heuristics and the prevalence of biases are not restricted to laymen. Experienced researchers are also prone to the same biases when they think intuitively. For example, the tendency to predict the outcome that best represents the data, with insufficient regard for prior probability, has been observed in the intuitive judgments of individuals who have had extensive training in statistics (1, 5). Although the statistically sophisticated avoid elementary errors, such as the gambler's fallacy, their intuitive judgments are liable to similar fallacies in more intricate and less transparent problems.

It is not surprising that useful heuristics such as representativeness and availability are retained, even though they occasionally lead to errors in prediction or estimation. What is perhaps surprising is the failure of people to infer from lifelong experience such fundamental statistical rules as regression toward the mean, or the effect of sample size on sampling variability. Although everyone is exposed, in the normal course of life, to numerous examples from which these rules could have been induced, very few people discover the principles of sampling and regression on their own. Statistical principles are not learned from everyday experience because the relevant instances are not coded appropriately. For example, people do not discover that successive lines in a text differ more in average word length than do successive pages, because they simply do not attend to the average word length of individual lines or pages. Thus, people do not learn the relation between sample size and sampling variability, although the data for such learning are abundant.

The lack of an appropriate code also explains why people usually do not detect the biases in their judgments of probability. A person could conceivably learn whether his judgments are externally calibrated by keeping a tally of the proportion of events that actually occur among those to which he assigns the same probability. However, it is not natural to group events by their judged probability. In the absence of such grouping it is impossible for an individual to discover, for example, that only 50 percent of the predictions to which he has assigned a probability of .9 or higher actually came true.

The empirical analysis of cognitive biases has implications for the theoretical and applied role of judged probabilities. Modern decision theory (12, 13) regards subjective probability as the quantified opinion of an idealized person. Specifically, the subjective probability of a given event is defined by the set of bets about this event that such a person is willing to accept. An internally consistent, or coherent, subjective probability measure can be derived for an individual if his choices among bets satisfy certain principles, that is, the axioms of the theory. The derived probability is subjective in the sense that different individuals are allowed to have different probabilities for the same event. The major contribution of this approach is that it provides a rigorous subjective interpretation of probability that is applicable to unique events and is embedded in a general theory of rational decision.

It should perhaps be noted that, while subjective probabilities can sometimes be inferred from preferences among bets, they are normally not formed in this fashion. A person bets on team A rather than on team B because he believes that team A is more likely to win; he does not infer this belief from his betting preferences. Thus, in reality, subjective probabilities determine preferences among bets and are not derived from them, as in the axiomatic theory of rational decision (12).

The inherently subjective nature of probability has led many students to the belief that coherence, or internal consistency, is the only valid criterion by which judged probabilities should be evaluated. From the standpoint of the formal theory of subjective probability, any set of internally consistent probability judgments is as good as any other. This criterion is not entirely satisfactory, because an internally consistent set of subjective probabilities can be incompatible with other beliefs held by the individual. Consider a person whose subjective probabilities for all possible outcomes of a coin-tossing game reflect the gambler's fallacy. That is, his estimate of the probability of tails on a particular toss increases with the number of consecutive heads that preceded that toss. The judgments of such a person could be internally consistent and therefore acceptable as adequate subjective probabilities according to the criterion of the formal theory. These probabilities, however, are incompatible with the generally held belief that a coin has no memory and is therefore incapable of generating sequential dependencies. For judged probabilities to be considered adequate, or rational, internal consistency is not enough. The judgments must be compatible with the entire web of beliefs held by the individual. Unfortunately, there can be no simple formal procedure for assessing the compatibility of a set of probability judgments with the judge's total system of beliefs. The rational judge will nevertheless strive for compatibility, even though internal consistency is more easily achieved and assessed. In particular, he will attempt to make his probability judgments compatible with his knowledge about the subject matter, the laws of probability, and his own judgmental heuristics and biases.
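
The tally described in the Discussion, in which events are grouped by the probability a judge assigned to them and each group is compared with the proportion that actually occurred, can be sketched in a few lines of Python. This is only an illustration: the function name `calibration_table` and the sample record are assumptions made for the example, not data or procedures from the studies cited.

```python
from collections import defaultdict


def calibration_table(forecasts):
    """Group (stated_probability, occurred) pairs by stated probability.

    Returns {stated_probability: (number_of_events, observed_frequency)}.
    """
    tally = defaultdict(lambda: [0, 0])  # stated p -> [events, events that occurred]
    for stated_p, occurred in forecasts:
        tally[stated_p][0] += 1
        tally[stated_p][1] += int(bool(occurred))
    return {p: (n, hits / n) for p, (n, hits) in sorted(tally.items())}


# Hypothetical record (not from the article): ten predictions made with
# probability .9, of which only five came true, plus two made at .5.
forecasts = [(0.9, True)] * 5 + [(0.9, False)] * 5 + [(0.5, True), (0.5, False)]

table = calibration_table(forecasts)
for stated_p, (n, freq) in table.items():
    print(f"stated {stated_p:.2f}: {n} events, observed frequency {freq:.2f}")
```

A judge who is externally calibrated would show an observed frequency close to the stated probability in every group; in the hypothetical record above, the .9 group is badly overconfident.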
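
The probabilities quoted for the marble bets in the passage on conjunctive and disjunctive events (.48, .50, and .52) follow from the independence of draws made with replacement. The short Python check below reproduces them; the function names are illustrative assumptions, and the 20-step plan at the end is a hypothetical example of the planning point, not a figure from the article.

```python
def conjunctive(p, n):
    """Probability that an event of probability p occurs on all n independent trials."""
    return p ** n


def disjunctive(p, n):
    """Probability that an event of probability p occurs at least once in n independent trials."""
    return 1 - (1 - p) ** n


simple = 0.50                 # one draw from a bag with 50 percent red marbles
conj = conjunctive(0.90, 7)   # red on all seven draws, bag with 90 percent red
disj = disjunctive(0.10, 7)   # red at least once in seven draws, bag with 10 percent red
print(f"simple {simple:.2f}, conjunctive {conj:.2f}, disjunctive {disj:.2f}")

# Hypothetical plan (an assumption for illustration): 20 independent steps,
# each with a 95 percent chance of success.
plan = conjunctive(0.95, 20)
print(f"20-step plan succeeds with probability {plan:.2f}")
```

Although each elementary probability (.90 or .10) is far from .50, the compound probabilities land on either side of it, which is why anchoring on the elementary value misorders the three bets.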
Summary

This article described three heuristics that are employed in making judgments under uncertainty: (i) representativeness, which is usually employed when people are asked to judge the probability that an object or event A belongs to class or process B; (ii) availability of instances or scenarios, which is often employed when people are asked to assess the frequency of a class or the plausibility of a particular development; and (iii) adjustment from an anchor, which is usually employed in numerical prediction when a relevant value is available. These heuristics are highly economical and usually effective, but they lead to systematic and predictable errors. A better understanding of these heuristics and of the biases to which they lead could improve judgments and decisions in situations of uncertainty.

References and Notes

1. D. Kahneman and A. Tversky, Psychol. Rev. 80, 237 (1973).
2. ----, Cognitive Psychol. 3, 430 (1972).
3. W. Edwards, in Formal Representation of Human Judgment, B. Kleinmuntz, Ed. (Wiley, New York, 1968), pp. 17-52.
4. P. Slovic and S. Lichtenstein, Organ. Behav. Hum. Performance 6, 649 (1971).
5. A. Tversky and D. Kahneman, Psychol. Bull. 76, 105 (1971).
6. ----, Cognitive Psychol. 5, 207 (1973).
7. R. C. Galbraith and B. J. Underwood, Mem. Cognition 1, 56 (1973).
8. L. J. Chapman and J. P. Chapman, J. Abnorm. Psychol. 73, 193 (1967); ibid., 74, 271 (1969).
9. M. Bar-Hillel, Organ. Behav. Hum. Performance 9, 396 (1973).
10. J. Cohen, E. I. Chesnick, D. Haran, Br. J. Psychol. 63, 41 (1972).
11. M. Alpert and H. Raiffa, unpublished manuscript; C. A. S. von Holstein, Acta Psychol. 35, 478 (1971); R. L. Winkler, J. Am. Stat. Assoc. 62, 776 (1967).
12. L. J. Savage, The Foundations of Statistics (Wiley, New York, 1954).
13. B. De Finetti, in International Encyclopedia of the Social Sciences, D. E. Sills, Ed. (Macmillan, New York, 1968), vol. 12, pp. 496-504.
14. This research was supported by the Advanced Research Projects Agency of the Department of Defense and was monitored by the Office of Naval Research under contract N00014-73-C-0438 to the Oregon Research Institute, Eugene. Additional support for this research was provided by the Research and Development Authority of the Hebrew University, Jerusalem, Israel.