Cognitive Psychology and Self-Reports: Models and Methods
© 2003 Kluwer Academic Publishers. Printed in the Netherlands.
Jared B. Jobe
National Heart, Lung, and Blood Institute, Bethesda, MD, USA (E-mail: [email protected])
Abstract
This article describes the models and methods that cognitive psychologists and survey researchers use to evaluate and experimentally test cognitive issues in questionnaire design and subsequently improve self-report instruments. These models and methods assess the cognitive processes underlying how respondents comprehend and generate answers to self-report questions. Cognitive processing models are briefly described. Non-experimental methods – expert cognitive review, cognitive task analysis, focus groups, and cognitive interviews – are described. Examples are provided of how these methods were effectively used to identify cognitive self-report issues. Experimental methods – cognitive laboratory experiments, field tests, and experiments embedded in field surveys – are described. Examples are provided of: (a) how laboratory experiments were designed to test the capability and accuracy of respondents in performing the cognitive tasks required to answer self-report questions, (b) how a field experiment was conducted in which a cognitively designed questionnaire was effectively tested against the original questionnaire, and (c) how a cognitive experiment embedded in a field survey was conducted to test cognitive predictions.
Key words: Autobiographical memory, Cognitive interviews, Focus groups, Information processing models
processes to estimate an answer, and (4) formulation of a response. This Four-Stage Model, proposed by Tourangeau [8], is the most frequently cited model. Six of the seven models propose sequential processing; that is, each stage or task must be carried out before the next one can begin. One model, the Flexible Processing Model (Figure 1), proposes that processing is not necessarily automatic and that control processes can occur both before and after retrieval from memory [9]. Another model, the Survey Interaction Model, holds that cognitive processing alone does not account for all responses: other psychological processes, such as motivation, emotion, sensation, and personality, as well as biological systems and characteristics, such as gender and fatigue, are also important in the question-answering process [10]. The Survey Interaction Model also postulates a brief reorientation state that occurs after individual questions have been asked and answered, explaining mental shifts that occur when sensitive questions are asked. The common weakness of all the models is that they are explanatory models and not predictive models: none differentially predict responses to survey questions.

Figure 1. Flexible processing model of the survey response process (adapted from Willis et al. [9]; in the public domain).

For about the past 15 years these cognitive models, along with theories and methods from cognitive psychology, have been used with great success by cognitive psychologists and survey methodologists to identify cognitive response problems with self-report questions and to improve the quality of data collected in surveys and epidemiologic studies, an area of research known as the cognitive aspects of survey methodology. Willis et al. [9] divide the methods used in this interdisciplinary field into two categories: evaluative techniques and experimental techniques. Evaluative methods are non-experimental and include expert cognitive assessment of a questionnaire [11, 12], cognitive task analysis [13, 14], focus groups [15, 18], and cognitive interviews [9, 15, 18–22]. Data from non-experimental methods are usually qualitative. Experimental techniques are based on hypothesis testing and include cognitive experiments conducted in laboratories [23, 24], cognitive field experiments [25, 26], and cognitive experiments embedded in field surveys [27]. Experimental methods can also be considered to be evaluative in the sense that they can be used to evaluate the capabilities of respondents to answer questions or compare the accuracy of responses to different questionnaires. Data from experimental methods are usually quantitative. An effective process is to first develop new questions using the non-experimental methods [22] and then compare them in a field study against the old version of the questionnaire using validation methods [26]. Validation methods are not always available, such as for some sensitive questions (e.g., drug use, sexual partners) or questions that require the respondent to provide an opinion or evaluative judgement (e.g., quality of life questions asking for ratings of health).

Non-experimental methods

Cognitive review of questionnaires

least well remembered retrieval cue for recalling information about an event from memory [28, 29]. Therefore, to improve reporting of smoking cessation attempts, the question sequence should be revised so that all of the details about the smoking cessation attempt are recalled before the respondent is asked to recall the date. Under the heading, Field Experiments, later in this article, I describe how the question sequence was changed and tested.
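The reordering principle described above (recall the details of the event first, date it last) can be illustrated with a short sketch. This is only an illustration under stated assumptions: the `Question` type, the `kind` labels, and the sample question wordings are hypothetical and are not taken from any actual survey instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    text: str
    kind: str  # hypothetical label: "detail" (event circumstances) or "date"


def date_last(questions):
    """Return the questions with event-detail items first and dating items last.

    sorted() is stable, so the relative order within each group is preserved.
    """
    return sorted(questions, key=lambda q: q.kind == "date")


# Hypothetical original sequence that asks for the date first.
original = [
    Question("When did you try to stop smoking?", "date"),
    Question("Why did you try to stop smoking?", "detail"),
    Question("How did you try to stop smoking?", "detail"),
    Question("What was the result of the attempt?", "detail"),
]

revised = date_last(original)
for q in revised:
    print(q.text)
```

In the revised sequence the respondent works through the circumstances of the attempt before being asked to date it, which is the ordering the memory literature cited above favors.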
useful early in the questionnaire development process. Cognitive researchers did not invent focus groups, which have been used for many years, but have employed them in new ways. Focus groups can be used to elucidate both issues that are essentially cognitive, such as whether respondents understand the terminology used in the draft questionnaire and what retrieval cues might be used to improve recall, and issues that are both cognitive and social-motivational, such as whether respondents are likely to answer sensitive questions honestly, and whether respondents consider questions to be sensitive.

For example, Sudman et al. [17] conducted two focus groups with women aged 50 and over regarding cancer detection tests, and found that respondents did not consider Pap smears, mammograms, and breast exams to be sensitive topics. They also found that women had very different interpretations of the term ‘physical examinations,’ and for many women how they recalled these tests depended upon whether they were part of an annual physical examination or were scheduled separately.

Pratt et al. [16] conducted four focus groups to understand the major concerns that women of childbearing age had about sensitive questions and the procedures used to collect sensitive data. They found that the issue of confidentiality was of the most concern to the respondents. Fear that their responses to questions about sexual behavior, abortions, and drug use would be overheard or revealed raised questions about how the data would be collected and stored. Moreover, they were concerned about the privacy of these topics. Respondents described the important qualities that an interviewer should have and rated pictures of prospective interviewers. They revealed a preference for women interviewers who matched their ethnicity. The results of the focus groups were used to design a field study to investigate variables such as mode of data collection and type of interviewer.

Bercini [15] described an example of comprehension problems that occur when the terms used are vague or unfamiliar. Twelve focus group participants were asked the question, ‘During the past year have you been troubled by pain in your abdomen?’ The question was a screener for a questionnaire on gall bladder disease. One respondent who had gall bladder disease said he wasn’t bothered by the pain because he knew it was his gall bladder. He was exactly the kind of respondent for whom the questionnaire was designed and he would have been screened out of the questionnaire. Respondents thought the term ‘troubled by’ was vague. To determine whether respondents understood the correct area for the abdomen, they were then given a diagram of the human torso divided into 20 blocks and were asked to shade the abdomen. No two people shaded the same area and no one shaded the correct area. The solution was to present the respondents a show card with the correct area of the abdomen shaded and ask them if they had pain in that area. The vague term ‘troubled by’ was omitted.

Cognitive interviews

With or without the use of focus groups, intensive one-on-one interviews are conducted using extant or draft questionnaires [9, 15, 18–22]. These cognitive interviews can include several techniques, such as a concurrent think-aloud in which the respondents are asked to think aloud when answering the questions [32]. Followup probe questions are used extensively to understand cognitive processes when answering a particular question: What does the term, ‘abdomen’ mean to you; how do you remember how many times you went to the dentist; or how sure are you of your answer? (p. 253) [9]. Think-aloud interviews help investigators understand how respondents cognitively process the questions and answers, such as whether respondents understand the questions in the same way as the question designer, how respondents recall information, and whether respondents recall information or simply guess (p. 243) [15].

Jobe and Mingay [19] interviewed older adults about their activities of daily living, instrumental activities of daily living, and social support using think-aloud interviews with probes. They found that respondents answered more in narrative than in categorical terms, did not consider qualifiers in formulating answers, varied in their interpretation of questions and terminology, such as ‘aid from another person’, and answered from a perspective of capacity to perform tasks, rather than actual performance of those tasks.

Keller et al. [20] followed up this project with another project using a concurrent think-aloud
protocol, an extensive set of scripted followup probes, and showcards containing lists of reasons for activities and meanings of terminology. They found that respondents did not interpret words or phrases such as ‘aids’, ‘difficulty’, and ‘help from another person’ as intended by the question designers. Respondents did not appear to recall from memory all the kinds of help they received, nor did they report their health as the reason for needing assistance with activities. Finally, responses to many functional status questions were often conditional upon the circumstances.

Experimental methods

Cognitive laboratory experiments

Cognitive laboratory experiments are designed after completion of the evaluative methods. For example, after the cognitive tasks are identified, experiments can be designed to test the accuracy with which respondents can perform the cognitive task, or even the ability of respondents to perform the task at all. Smith et al. [23, 24] designed a series of cognitive laboratory experiments to test the capabilities of respondents to perform the cognitive tasks involved in reporting dietary intake. The first two experiments concerned frequency judgments and magnitude estimation, respectively [23]. In the first experiment, around 400 college students were asked to think about either the most recent occasion on which they had eaten certain foods or about all of the occasions on which they had eaten the same foods. Their frequency judgments were affected by their thoughts before answering the question: those students instructed to think about the range of occasions on which they had eaten the foods reported higher consumption than those instructed to think only about the most recent occasion on which they had eaten the foods.

In the second experiment, the same college students made judgments about their usual portion sizes of four foods (from a set of eight foods) relative to portion sizes described as one of three different values. As Table 1 shows, only for fruit juice were their judgments of their usual portion sizes significantly affected by differences in the described standards in the predicted direction. The results suggest that respondents are not sensitive to portion size definitions, probably because they do not have a cognitive representation of most food portions.

The third and fourth experiments concerned long-term recall of diet [24]. In experiment three, subjects kept food diaries for 2 or 4 weeks, returned 0, 2, 3, or 6 weeks after the end of the diary period, and were asked to recall their diets. Memory performance deteriorated as a function of the retention interval. Intrusion rates for those who kept diaries for 2 weeks increased as a function of the retention interval, but did not for those in the 4 week diary group.

In experiment four, subjects kept food diaries for two separate periods and reported for both periods at the end of the second diary period. The recalled items for the second diary period better matched the food diary for the second diary period than the recalled items for the first diary period matched the first food diary. However, recalled items for the first diary period equally matched both food diaries. The results of experiments three and four suggest that respondents rely on generic memory about their own diets when they attempt to report their food intake for extended or remote periods of time.

Field experiments

In a cognitive field experiment, a questionnaire that has been evaluated using some or all of the cognitive methods described above is tested against the original questionnaire, with respondents questioned in field settings (homes, offices, or public places) [25, 26]. When possible, a validation technique is utilized, such as a food or pain diary, health records, or records from another research project. (Record sources are sometimes problematic themselves.) Respondents’ answers to key questions are compared to the record source, and the version that corresponds most closely to the record source is considered to be the most accurate questionnaire version. With sensitive questions, such as drug use or sexual behavior, record sources are usually not available to validate respondents’ answers, and the questionnaire version that yields the most frequent responses is considered to be the most accurate [16].

As described in the section on Cognitive Review of Questionnaires, Means et al. [25] judged that the
National Health Interview Survey smoking questions asked the respondent to date the smoking cessation attempt early in the series of questions, whereas the cognitive literature indicated that the event should be recalled first and dated last [28, 29]. The question sequence was then redesigned to be compatible with memory retrieval processes. Questions were first asked about why the respondent tried to stop smoking, followed by how they tried to stop smoking, the result of the attempt, and finally, when the cessation attempt occurred. Thus, the episode and the circumstances surrounding the episode were recalled before dating the attempt. Cognitive questions were also developed to ask unsuccessful quitters about their smoking history after the cessation attempt.

Table 1. Response distributions of reported portion size, by food item and definition of medium (from Smith et al. [23]; in the public domain).

The original and cognitively redesigned questionnaire versions were then tested in a small field experiment. The 56 respondents had participated in a stop smoking program 7–9 years earlier, and records were available from their participation in that intervention study. About half of these respondents had not smoked again, whereas the other half had relapsed. The cognitive version of the questionnaire overall resulted in significantly more accurate recall of the date of the cessation attempt than the original questionnaire: the mean average deviation in months from the actual date was 4.99 for the cognitive group and 10.74 for the original group. For respondents who had not smoked again, the questionnaire version did not affect their accuracy in dating the successful cessation attempt; the mean average deviation in months from the actual date was 3.95 for the cognitive group and 3.58 for the original group. This is not surprising because the date they stopped smoking is a highly salient event and is likely to be mentally rehearsed over and over again. However, if the quit attempt was not successful, then the questionnaire version had a significant effect: the mean average deviation in months from the actual date was 5.90 for the cognitive group and 17.91 for the original group. Unsuccessful quitters interviewed using the cognitive version also recalled their smoking history after the cessation attempt more accurately than those interviewed with the original questionnaire. Thus, the cognitive methods used to evaluate and redesign the questionnaire were largely effective in improving the accuracy of the smoking questions.

Experiments embedded in field surveys

Similar to the field experiment is the cognitive experiment embedded in a field survey. The differences are mainly of scale and complexity. In the field survey, different versions of the experimental questionnaire are embedded into a larger survey. Sampling is often different in that the field survey will often use a scientifically selected sample, whereas the field experiment will often use a convenience sample. (See Thompson et al. [26] for an example of a field experiment with a sample design.) The field survey uses professional interviewers, whereas the field experiment often uses research staff. Finally, the field survey often uses a significantly larger sample.

Jobe et al. [27] conducted an embedded experiment in the National Health Interview Survey/National Medical Expenditure Survey Linkage Field Test. Demographic groups tending to have more health problems and less access to care were oversampled. Interviews were conducted in respondents’ households by professional interviewers. Embedded into the survey was a sequence of questions about medical provider visits during the previous 6 months. Respondents were asked to recall the month and day of each visit as well as the reason for each visit; the experimental sequence was the first set of questions in the survey. The experiment involved the instructions to the respondents about the order of recalling the medical provider visits. Households were randomly assigned to receive instructions to recall the visits in one of three ways: a forward chronological order, backward or reverse chronological order, or free or uninstructed recall. The experiment was motivated by equivocal results in the cognitive literature regarding which method of recall should lead to superior reporting: forward recall could be superior because recall order matches the sequence of experience; backward recall could be superior because later events are better remembered and could be used as retrieval cues to recall the earlier visits; or neither may be better because the advantages balance.

At the conclusion of the interview, respondents were asked to sign permission forms allowing access to their medical records. Five hundred and seven interviews were completed, with 337 respondents reporting at least one medical visit during the 6-month reference period, and naming 542 providers of whom 335 provided records. Complete medical records were obtained for 130 respondents, of whom 75 had multiple visits. Self-reports were matched to records for date and reason; inter-rater reliability was 85%. When the analysis was restricted to those who reported between two and five visits, respondents not instructed in recall order correctly recalled significantly more of their visits (69%) than respondents instructed to recall in forward (38%) or backward order (38%). When those with more than five visits were included in the analyses, the results were not statistically significant. The results are consistent with those of Means and Loftus [33], who found respondents could not recall individual medical visits, only typical visits, when five or more visits occurred for a chronic condition. Their memory appeared to be generic rather than episodic.

Conclusions

The non-experimental and experimental approaches described in this article demonstrate that a cognitive approach to improve the quality of self-reports in health surveys and epidemiologic studies is effective in identifying the cognitive tasks that these instruments ask respondents to perform, and in determining the limitations of respondents in performing these tasks. More importantly, cognitive models, methods, and previous cognitive research have identified promising approaches to improve the accuracy of self-reports and in
demonstrating their effectiveness in field experiments and experiments embedded in field surveys using validation data.

Self-report questions ask respondents for four types of responses: knowledge, behavior, opinion, and proxy [21]. The vast majority of previous research on improving questionnaire design has been conducted on questions that request behavioral information, which asks respondents to recall information from autobiographical memory. Health Related Quality of Life questionnaires often ask respondents for more opinion or subjective responses: rate their overall health status, rate their psychological well-being, rate their social relations/support, and judge their ability to carry out activities of daily living. Some significant research has been conducted on some of these areas, such as activities of daily living questions [19, 20], and attitude questions [34], and the techniques appear to apply to opinion questions as well as behavioral memory questions.

The quality of data from questionnaires can be improved by not only the non-experimental and experimental methods described here, but also by use of formatting and design principles for self-administered questionnaires [35]. An excellent discussion of formatting Health Related Quality of Life Instruments is provided by Mullin et al. [36]. The combination of cognitive, formatting, and psychometric approaches can be effectively used to improve the accuracy of data from quality of life self-report instruments in clinical trials, epidemiologic studies, and health surveys.

Acknowledgments

The author is greatly indebted to Ivan Barofsky, James L. Esposito and Douglas J. Herrmann for their comments and suggestions on an earlier version of this article. Any remaining errors are the sole responsibility of the author.

References

1. Stone AA, Turkkan JS, Bachrach CA, Jobe JB, Kurtzman HS, Cain VS (eds). The Science of Self-report: Implications for Research and Practice. Mahwah, NJ: Erlbaum, 2000.
2. Rubin DC (ed). Autobiographical Memory. Cambridge, UK: Cambridge University Press, 1986.
3. Jobe JB, Mingay DJ. Cognition and survey measurement: History and overview. Appl Cognit Psychol 1991; 5: 175–192.
4. Jobe JB, Tourangeau R, Smith AF. Contributions of survey research to the understanding of memory. Appl Cognit Psychol 1993; 7: 567–584.
5. Sudman S, Bradburn N, Schwarz N. Thinking about Answers: The Application of Cognitive Processes to Survey Methodology. San Francisco: Jossey-Bass, 1995.
6. Tourangeau R, Rips LJ, Rasinski K. The Psychology of Survey Response. New York: Cambridge University Press, 2000.
7. Jobe JB, Herrmann DJ. Implications of models of survey cognition for memory theory. In: Herrmann D, Johnson M, McEvoy C, Hertzog C, Hertel P (eds), Basic and Applied Memory Research: Vol. 2. Practical Applications. Hillsdale, NJ: Erlbaum, 1996; 193–205.
8. Tourangeau R. Cognitive sciences and survey methods. In: Jabine TB, Straf ML, Tanur JM, Tourangeau R (eds), Cognitive Aspects of Survey Methodology: Building a Bridge between Disciplines. Washington, DC: National Academy Press, 1984; 73–101.
9. Willis GB, Royston P, Bercini D. The use of verbal report methods in the development and the testing of survey questionnaires. Appl Cognit Psychol 1991; 5: 251–267.
10. Esposito JL, Jobe JB. A general model of the survey interaction process. Bureau of the Census Seventh Annual Research Conference Proceedings 1991; 537–560.
11. Forsyth BH, Lessler JT. Cognitive laboratory methods: a taxonomy. In: Biemer PP, Groves RM, Lyberg LE, Mathiowetz NA, Sudman S (eds), Measurement Errors in Surveys. New York: Wiley, 1991; 167–183.
12. Lessler JT, Forsyth BH. A coding system for appraising questionnaires. In: Schwarz N, Sudman S (eds), Answering Questions: Methodology for Determining Cognitive and Communicative Processes in Survey Research. San Francisco: Jossey-Bass, 1996; 259–291.
13. Lee L, Brittingham A, Tourangeau R, et al. Are reporting errors due to encoding limitations or retrieval failure? Surveys of child vaccination as a case study. Appl Cognit Psychol 1999; 13: 43–63.
14. Smith AF. Cognitive processes in long-term dietary recall. Vital Health Stat 6. 1991; 4: 1–34.
15. Bercini DH. Pretesting questionnaires in the laboratory: an alternative approach. J Expo Anal Environ Epidemiol 1992; 2: 241–248.
16. Pratt WF, Tourangeau R, Jobe JB, et al. Asking sensitive questions in a health survey. Vital Health Stat 6 (in press).
17. Sudman S, Warnecke R, Johnson T, et al. Cognitive aspects of reporting cancer prevention examinations and tests. Vital Health Stat 6. 1994; 7: 1–171.
18. Willis GB. The use of the psychological laboratory to study sensitive survey topics. In: Harrison L, Hughes A (eds), The Validity of Self-reported Drug Use: Improving the Accuracy of Survey Estimates. Rockville, MD: National Institute on Drug Abuse, 1997; 416–438.
19. Jobe JB, Mingay DJ. Cognitive laboratory approach to designing questionnaires for surveys of the elderly. Public Health Rep 1990; 105: 518–524.
20. Keller DM, Kovar MG, Jobe JB, Branch LG. Problems eliciting elders’ reports of functional status. J Aging Health 1993; 5: 306–318.
21. Schechter S, Herrmann D. The proper use of self-report questions in effective measurement of health outcomes. Eval Health Prof 1997; 20: 28–46.
22. Subar AF, Thompson FE, Smith AF, et al. Improving food frequency questionnaires: A qualitative approach using cognitive interviews. J Am Diet Assoc 1995; 95: 781–788.
23. Smith AF, Jobe JB, Mingay DJ. Question-induced cognitive biases in reports of dietary intake by college men and women. Health Psychol 1991; 10: 244–251.
24. Smith AF, Jobe JB, Mingay DJ. Retrieval from memory of dietary information. Appl Cognit Psychol 1991; 5: 269–296.
25. Means B, Swan GE, Jobe JB, Esposito JL. An alternative approach to obtaining personal history data. In: Biemer PP, Groves RM, Lyberg LE, Mathiowetz NA, Sudman S (eds), Measurement Errors in Surveys. New York: Wiley, 1991; 167–183.
26. Thompson FE, Subar AF, Brown CC, et al. Cognitive research enhances accuracy of food frequency questionnaire reports: Results of an experimental validation study. J Am Diet Assoc 2002; 102: 212–218, 223–225.
27. Jobe JB, White AA, Kelley CL, et al. Recall strategies and memory for health care visits. Milbank Q 1991; 68: 171–189.
28. Brewer WF. Autobiographical memory and survey research. In: Schwarz N, Sudman S (eds), Autobiographical Memory and the Validity of Retrospective Reports. New York: Springer-Verlag, 1994; 11–20.
29. Wagenaar WA. My memory: A study of autobiographical memory over six years. Cognit Psychol 1986; 18: 225–252.
30. Drury CG, Paramore B, Van Cott HP, et al. Task analysis. In: Salvendy G (ed), Handbook of Human Factors. New York: Wiley, 1987; 370–401.
31. Krueger RA. Focus Groups: A Practical Guide for Applied Research. Thousand Oaks, CA: Sage, 1994.
32. Jobe JB, Mingay DJ. Cognitive research improves questionnaires. Am J Public Health 1989; 79: 1053–1055.
33. Means B, Loftus EF. When personal history repeats itself: Decomposing memories for recurring events. Appl Cognit Psychol 1991; 5: 297–318.
34. Tourangeau R, Rasinski KA. Cognitive processes underlying context effects in attitude measurement. Psychol Bull 1989; 103: 299–314.
35. Jenkins CR, Dillman DA. Towards a theory of self-administered questionnaire design. In: Lyberg L, Biemer P, Collins M, de Leeuw E, Dippo C, Schwarz N, Trewin D (eds), Survey Measurement and Process Quality. New York: Wiley, 1997; 165–196.
36. Mullin PA, Lohr KN, Bresnahan BW, McNulty P. Applying cognitive design principles to formatting HRQOL instruments. Qual Life Res 2000; 9: 13–27.

Address for correspondence: Jared B. Jobe, Behavioral Medicine Scientific Research Group, National Heart, Lung, and Blood Institute, 6701 Rockledge Drive, MSC 7936, Bethesda MD 20892-7936, USA
Phone: +1-301-435-0407; Fax: +1-301-480-1773
E-mail: [email protected]