Psychological Testing: An Introduction, 2nd Edition (ISBN 0521861810, 9780521861816)

The book 'Psychological Testing: An Introduction, 2nd Edition' provides a comprehensive overview of psychological testing, covering basic concepts, test construction, reliability, and validity. It also explores various dimensions of testing, including personality, cognition, and psychopathology, as well as applications in different settings such as schools and clinical environments. The text aims to integrate classical approaches with newer perspectives, making it accessible and relevant for both students and practitioners in the field.

CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo

Cambridge University Press
The Edinburgh Building, Cambridge CB2 2RU, UK
Published in the United States of America by Cambridge University Press, New York
www.cambridge.org
Information on this title: www.cambridge.org/9780521861816

© Cambridge University Press 2006

This publication is in copyright. Subject to statutory exception and to the provision of
relevant collective licensing agreements, no reproduction of any part may take place
without the written permission of Cambridge University Press.

First published in print format 2006

ISBN-13 978-0-511-22012-8 eBook (EBL)
ISBN-10 0-511-22012-X eBook (EBL)
ISBN-13 978-0-521-86181-6 hardback
ISBN-10 0-521-86181-0 hardback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs
for external or third-party internet websites referred to in this publication, and does not
guarantee that any content on such websites is, or will remain, accurate or appropriate.
P1: JZP
0521861810pre CB1038/Domino 0 521 86181 0 March 6, 2006 13:31

Contents

Preface
Acknowledgments

PART ONE. BASIC ISSUES

1 The Nature of Tests
Aim; Introduction; Categories of Tests; Ethical Standards; Information about Tests; Summary; Suggested Readings; Discussion Questions
2 Test Construction, Administration, and Interpretation
Aim; Constructing a Test; Test Items; Philosophical Issues; Administering a Test; Interpreting Test Scores; Item Characteristics; Norms; Combining Test Scores; Summary; Suggested Readings; Discussion Questions
3 Reliability and Validity
Aim; Introduction; Reliability; Types of Reliability; Validity; Aspects of Validity; Summary; Suggested Readings; Discussion Questions

PART TWO. DIMENSIONS OF TESTING

4 Personality
Aim; Introduction; Some Basic Issues; Types of Personality Tests; Examples of Specific Tests; The Big Five; Summary; Suggested Readings; Discussion Questions
5 Cognition
Aim; Introduction; Theories of Intelligence; Other Aspects; The Binet Tests; The Wechsler Tests; Other Tests; Summary; Suggested Readings; Discussion Questions
6 Attitudes, Values, and Interests
Aim; Attitudes; Values; Interests; Summary; Suggested Readings; Discussion Questions
7 Psychopathology
Aim; Introduction; Measures; The Minnesota Multiphasic Personality Inventory (MMPI) and MMPI-2; The Millon Clinical Multiaxial Inventory (MCMI); Other Measures; Summary; Suggested Readings; Discussion Questions
8 Normal Positive Functioning
Aim; Self-Concept; Locus of Control; Sexuality; Creativity; Imagery; Competitiveness; Hope; Hassles; Loneliness; Death Anxiety; Summary; Suggested Readings; Discussion Questions

PART THREE. APPLICATIONS OF TESTING

9 Special Children
Aim; Some Issues Regarding Testing; Categories of Special Children; Some General Issues About Tests; Summary; Suggested Readings; Discussion Questions
10 Older Persons
Aim; Some Overall Issues; Attitudes Toward the Elderly; Anxiety About Aging; Life Satisfaction; Marital Satisfaction; Morale; Coping or Adaptation; Death and Dying; Neuropsychological Assessment; Depression; Summary; Suggested Readings; Discussion Questions
11 Testing in a Cross-Cultural Context
Aim; Introduction; Measurement Bias; Cross-Cultural Assessment; Measurement of Acculturation; Some Culture-Fair Tests and Findings; Standardized Tests; Summary; Suggested Readings; Discussion Questions
12 Disability and Rehabilitation
Aim; Some General Concerns; Modified Testing; Some General Results; Legal Issues; The Visually Impaired; Hearing Impaired; Physical-Motor Disabilities; Summary; Suggested Readings; Discussion Questions

PART FOUR. THE SETTINGS

13 Testing in the Schools
Aim; Preschool Assessment; Assessment in the Primary Grades; High School; Admission into College; The Graduate Record Examination; Entrance into Professional Training; Tests for Licensure and Certification; Summary; Suggested Readings; Discussion Questions
14 Occupational Settings
Aim; Some Basic Issues; Some Basic Findings; Ratings; The Role of Personality; Biographical Data (Biodata); Assessment Centers; Illustrative Industrial Concerns; Testing in the Military; Prediction of Police Performance; Examples of Specific Tests; Integrity Tests; Summary; Suggested Readings; Discussion Questions
15 Clinical and Forensic Settings
Aim; Clinical Psychology: Neuropsychological Testing; Projective Techniques; Some Clinical Issues and Syndromes; Health Psychology; Forensic Psychology; Legal Standards; Legal Cases; Summary; Suggested Readings; Discussion Questions

PART FIVE. CHALLENGES TO TESTING

16 The Issue of Faking
Aim; Some Basic Issues; Some Psychometric Issues; Techniques to Discourage Faking; Related Issues; The MMPI and Faking; The CPI and Faking; Social Desirability and Assessment Issues; Acquiescence; Other Issues; Test Anxiety; Testwiseness; Summary; Suggested Readings; Discussion Questions
17 The Role of Computers
Aim; Historical Perspective; Computer Scoring of Tests; Computer Administration of Tests; Computer-Based Test Interpretations (CBTI); Some Specific Tests; Adaptive Testing and Computers; Ethical Issues Involving Computer Use; Other Issues and Computer Use; A Look at Other Tests and Computer Use; The Future of Computerized Psychological Testing; Summary; Suggested Readings; Discussion Questions
18 Testing Behavior and Environments
Aim; Traditional Assessment; Behavioral Assessment; Traditional vs. Behavioral Assessment; Validity of Behavioral Assessment; Behavioral Checklists; Behavioral Questionnaires; Program Evaluation; Assessment of Environments; Assessment of Family Functioning; Broad-Based Instruments; Summary; Suggested Readings; Discussion Questions
19 The History of Psychological Testing
Aim; Introduction; The French Clinical Tradition; The German Nomothetic Approach; The British Idiographic Approach; The American Applied Orientation; Some Recent Developments; Summary; Suggested Readings; Discussion Questions

Appendix: Table to Translate Difficulty Level of a Test Item into a z Score
References
Test Index
Index of Acronyms
Subject Index

Preface

My first professional publication in 1963 was as a graduate student (with Harrison Gough) on a validational study of a culture-fair test. Since then, I have taught a course on psychological testing with fair regularity. At the same time, I have steadfastly refused to specialize and have had the opportunity to publish in several different areas, to work in management consulting, to be director of a counseling center and of a clinical psychology program, to establish an undergraduate honors program, and to be involved in a wide variety of projects with students in nursing, rehabilitation, education, social work, and other fields. In all of these activities, I have found psychological testing to be central and to be very challenging and exciting.

In this book, we have tried to convey the excitement associated with psychological testing and to teach basic principles through the use of concrete examples. When specific tests are mentioned, they are mentioned because they are used as an example to teach important basic principles, or in some instances, because they occupy a central/historical position. No attempt has been made to be exhaustive.

Much of what is contained in many testing textbooks is rather esoteric information, of use only to very few readers. For example, most textbooks include several formulas to compute interitem consistency. It has been our experience, however, that 99% of the students who take a course on testing will never have occasion to use such formulas, even if they enter a career in psychology or allied fields. The very few who might need to do such calculations will do them by computer or will know where to find the relevant formulas. It is the principle that is important, and that is what we have tried to emphasize.

Because of my varied experience in industry, in a counseling center, and other service-oriented settings, and also because as a clinically trained academic psychologist I have done a considerable amount of research, I have tried to cover both sides of the coin – the basic research-oriented issues and the application of tests in service-oriented settings. Thus Parts One and Two, the first eight chapters, serve as an introduction to basic concepts, issues, and approaches. Parts Three and Four, Chapters 9 through 15, have a much more applied focus. Finally, we have attempted to integrate both classical approaches and newer thinking about psychological testing.

The area of psychological testing is fairly well defined. I cannot imagine a textbook that does not discuss such topics as reliability, validity, and norms. Thus, what distinguishes one textbook from another is not so much its content but more a question of balance. For example, most textbooks continue to devote one or more chapters to projective techniques, even though their use and importance has decreased substantially. Projective techniques are important, not only from a historical perspective, but also for what they can teach us about basic issues in testing. In this text, they are discussed and illustrated, but as part of a chapter (see Chapter 15) within the broader context of testing in clinical settings.

Most textbooks also have several chapters on intelligence testing, often devoting considerable space to such topics as the heritability of intelligence, theories of trait organization, longitudinal studies of intelligence, and similar topics. Such topics are of course important and fascinating, but do they really belong in a textbook on psychological testing? If they do, then that means that some other topics more directly relevant to testing are omitted or given short shrift. In this textbook, we have chosen to focus on testing and to minimize the theoretical issues associated with intelligence, personality, etc., except where they may be needed to have a better understanding of testing approaches.

It is no surprise that computers have had (and continue to have) a major impact on psychological testing, and so an entire chapter of this book (Chapter 17) is devoted to this topic. There is also a vast body of literature and great student interest on the topic of faking, and here too an entire chapter (Chapter 16) has been devoted to this topic. Most textbooks begin with a historical chapter. We have chosen to place this chapter last, so the reader can better appreciate the historical background from a more knowledgeable point of view.

Finally, rather than writing a textbook about testing, we have attempted to write a textbook about testing the individual. We believe that most testing applications involve an attempt to use tests as a tool to better understand an individual, whether that person is a client in therapy, a college student seeking career or academic guidance, a business executive wishing to capitalize on strengths and improve on weaknesses, or a volunteer in a scientific experiment.

Acknowledgments

In my career as a psychologist, I have had the excellent fortune to be mentored, directly and indirectly, by three giants in the psychological testing field. The first is Harrison Gough, my mentor in graduate school at Berkeley, who showed me how useful and exciting psychological tests can be when applied to real-life problems. More importantly, Gough has continued to be not only a mentor but also a genuine model to be emulated both as a psychologist and as a human being. Much of my thinking and approach to testing, as well as my major interest in students at all levels, is a direct reflection of Gough’s influence.

The second was Anne Anastasi, a treasured colleague at Fordham University, a generous friend, and the best chairperson I have ever worked with. Her textbook has been truly a model of scholarship and concise writing, the product of an extremely keen mind who advanced the field of psychological testing in many ways.

The third person was Lee J. Cronbach of Stanford University. My first undergraduate exposure to testing was through his textbook. In 1975, Cronbach wrote what is now a classic paper titled, “Beyond the two disciplines of scientific psychology” (American Psychologist, 1975, vol. 30, pp. 116–127), in which he argued that experimental psychology and the study of individual differences should be integrated. In that paper, Cronbach was kind enough to cite at some length two of my studies on college success as examples of this integration. Subsequently I was able to invite him to give a colloquium at the University of Arizona. My contacts with him were regrettably brief, but his writings greatly influenced my own thinking.

On a personal note, I thank Valerie, my wife of 40 years, for her love and support, and for being the best companion one could hope for in this voyage we call life. Our three children have been an enormous source of love and pride: Brian, currently a professor of philosophy at Miami University of Ohio; Marisa, a professor of health economics at the University of North Carolina, Chapel Hill; and Marla, chief forensic psychologist in the Department of Mental Health of South Carolina, and co-author of this edition. Zeno and Paolo, our two grandchildren, are unbelievably smart, handsome, and adorable and make grandparenting a joy. I have also been truly blessed with exceptional friends whose love and caring have enriched my life enormously.

George Domino
Tucson, AZ

An abundance of gratitude to my father for giving me the opportunity to collaborate with one of the greatest psychologists ever known. And an immeasurable amount of love and respect to my heroes – my Dad and Mom. I would also like to thank my mentor and friend, Stan Brodsky, whose professional accomplishments are only surpassed by his warmth, kindness, and generous soul.

Marla Domino
Columbia, SC


PART ONE: BASIC ISSUES

1 The Nature of Tests

AIM In this chapter we cover four basic issues. First, we focus on what is a test, not just
a formal definition, but on ways of thinking about tests. Second, we try to develop a
“taxonomy” of tests, that is we look at various ways in which tests can be categorized.
Third, we look at the ethical aspects of psychological testing. Finally, we explore how
we can obtain information about a specific test.

INTRODUCTION

Most likely you would have no difficulty identifying a psychological test, even if you met one in a dark alley. So the intent here is not to give you one more definition to memorize and repeat but rather to spark your thinking.

What is a test? Anastasi (1988), one of the best known psychologists in the field of testing, defined a test as an “objective” and “standardized” measure of a sample of behavior. This is an excellent definition that focuses our attention on three elements: (1) objectivity: that is, at least theoretically, most aspects of a test, such as how the test is scored and how the score is interpreted, are not a function of the subjective decision of a particular examiner but are based on objective criteria; (2) standardization: that is, no matter who administers, scores, and interprets the test, there is uniformity of procedure; and (3) a sample of behavior: a test is not a psychological X-ray, nor does it necessarily reveal hidden conflicts and forbidden wishes; it is a sample of a person’s behavior, hopefully a representative sample from which we can draw some inferences and hypotheses.

There are three other ways to consider psychological tests that we find useful and we hope you will also. One way is to consider the administration of a test as an experiment. In the classical type of experiment, the experimenter studies a phenomenon and observes the results, while at the same time keeping in check all extraneous variables so that the results can be ascribed to a particular antecedent cause. In psychological testing, however, it is usually not possible to control all the extraneous variables, but the metaphor here is a useful one that forces us to focus on the standardized procedures, on the elimination of conflicting causes, on experimental control, and on the generation of hypotheses that can be further investigated. So if I administer a test of achievement to little Sandra, I want to make sure that her score reflects what she has achieved, rather than her ability to follow instructions, her degree of hunger before lunch, her uneasiness at being tested, or some other influence.

A second way to consider a test is to think of a test as an interview. When you are administered an examination in your class, you are essentially being interviewed by the instructor to determine how well you know the material. We discuss interviews in Chapter 18, but for now consider the following: in most situations we need to “talk” to each other. If I am the instructor, I need to know how much you have learned. If I am hiring an architect to design a house or a contractor to build one, I need to evaluate their competency, and so on. Thus “interviews” are necessary, but a test offers many advantages over the standard interview. With a test I can “interview” 50 or 5,000 persons at one sitting. With a test I can be much more objective in my evaluation because for example, multiple-choice answer sheets do not discriminate on the basis of gender, ethnicity, or religion.

A third way to consider tests is as tools. Many fields of endeavor have specific tools – for example, physicians have scalpels and X-rays, chemists have Bunsen burners and retorts. Just because someone can wield a scalpel or light up a Bunsen burner does not make him or her an “expert” in that field. The best use of a tool is in the hands of a trained professional when it is simply an aid to achieve a particular goal. Tests, however, are not just psychological tools; they also have political and social repercussions. For example, the well-publicized decline in SAT scores (Wirtz & Howe, 1977) has been used as an indicator of the terrible shape our educational system is in (National Commission, 1983).

A test by any other name. . . . In this book, we use the term psychological test (or more briefly test) to cover those measuring devices, techniques, procedures, examinations, etc., that in some way assess variables relevant to psychological functioning. Some of these variables, such as intelligence, introversion-extraversion, and self-esteem are clearly “psychological” in nature. Others, such as heart rate or the amount of palmar perspiration (the galvanic skin response), are more physiological but are related to psychological functioning. Still other variables, such as socialization, delinquency, or leadership, may be somewhat more “sociological” in nature, but are of substantial interest to most social and behavioral scientists. Other variables, such as academic achievement, might be more relevant to educators or professionals working in educational settings. The point here is that we use the term psychological in a rather broad sense.

Psychological tests can take a variety of forms. Some are true-false inventories, others are rating scales, some are actual tests, whereas others are questionnaires. Some tests consist of materials such as inkblots or pictures to which the subject responds verbally; still others consist of items such as blocks or pieces of a puzzle that the subject manipulates. A large number of tests are simply a set of printed items requiring some type of written response.

Testing vs. assessment. Psychological assessment is basically a judgmental process whereby a broad range of information, often including the results of psychological tests, is integrated into a meaningful understanding of a particular person. If that person is a client or patient in a psychotherapeutic setting, we call the process clinical assessment. Psychological testing is thus a narrower concept referring to the psychometric aspects of a test (the technical information about the test), the actual administration and scoring of the test, and the interpretation made of the scores. We could of course assess a client simply by administering a test or battery (group) of tests. Usually the assessing psychologist also interviews the client, obtains background information, and where appropriate and feasible, information from others about the client [see Korchin, 1976, for an excellent discussion of clinical assessment, and G. J. Meyer, Finn, Eyde, et al. (2001) for a brief overview of assessment].

Purposes of tests. Tests are used for a wide variety of purposes that can be subsumed under more general categories. Many authors identify four categories typically labeled as: classification, self-understanding, program evaluation, and scientific inquiry.

Classification involves a decision that a particular person belongs in a certain category. For example, based on test results we may assign a diagnosis to a patient, place a student in the introductory Spanish course rather than the intermediate or advanced course, or certify that a person has met the minimal qualifications to practice medicine.

Self-understanding involves using test information as a source of information about oneself. Such information may already be available to the individual, but not in a formal way. Marlene, for example, is applying to graduate studies in electrical engineering; her high GRE scores confirm what she already knows, that she has the potential abilities required for graduate work.

Program evaluation involves the use of tests to assess the effectiveness of a particular program or course of action. You have probably seen in the newspaper, tables indicating the average achievement test scores for various schools in your geographical area, with the scores often taken, perhaps incorrectly, as evidence of the competency level of a particular school. Program evaluation may involve the assessment of the campus climate at a particular college, or the value of a drug abuse program offered by a mental health clinic, or the effectiveness of a new medication.

Tests are also used in scientific inquiry. If you glance through most professional journals in the social and behavioral sciences, you will find that a large majority of studies use psychological tests to operationally define relevant variables and to translate hypotheses into numerical statements that can be assessed statistically. Some argue that development of a field of science is, in large part, a function of the available measurement techniques (Cone & Foster, 1991; Meehl, 1978).

Tests as experimental procedure. If we accept the analogy that administering a test is very much like an experiment, then we need to make sure that the experimental procedure is followed carefully and that extraneous variables are not allowed to influence the results. This means, for example, that instructions and time limits need to be adhered to strictly. The greater the control that can be exercised on all aspects of a test situation, the lesser the influence of extraneous variables. Thus the scoring of a multiple-choice exam is less influenced by such variables as clarity of handwriting than the scoring of an essay exam; a true-false personality inventory with simple instructions is probably less influenced than an intelligence test with detailed instructions.

Masling (1960) reviewed a variety of studies of variables that can influence a testing situation, in this case “projective” testing (see Chapter 15); Sattler and Theye (1967) did the same for intelligence tests. We can identify, as Masling (1960) did, four categories of such variables:

1. The method of administration. Standard administration can be altered by disregarding or changing instructions, by explicitly or implicitly giving the subject a set to answer in a certain way, or by not following standard procedures. For example, Coffin (1941) had subjects read fictitious magazine articles indicating what were more socially acceptable responses to the Rorschach Inkblot test. Subsequently they were tested with the Rorschach and the responses clearly showed a suggestive influence because of the prior readings. Ironson and Davis (1979) administered a test of creativity three times, with instructions to “fake creative,” “fake uncreative,” or “be honest”; the obtained scores reflected the influence of the instructions. On the other hand, Sattler and Theye (1967) indicated that of twelve studies reviewed, which departed from standard administrative procedures, only five reported significant differences between standard and nonstandard administration.

2. Situational variables. These include a variety of aspects that presumably can alter the test situation significantly, such as a subject feeling frustrated, discouraged, hungry, being under the influence of drugs, and so on. Some of these variables can have significant effects on test scores, but the effects are not necessarily the same for all subjects. For example, Sattler and Theye (1967) report that discouragement affects the performance of children but not of college students on some intelligence tests.

3. Experimenter variables. The testing situation is a social situation, and even when the test is administered by computer, there is clearly an experimenter, a person in charge. That person may exhibit characteristics (such as age, gender, and skin color) that differ from those of the subject. The person may appear more or less sympathetic, warm or cold, more or less authoritarian, aloof, more adept at establishing rapport, etc. These aspects may or may not affect the subject’s test performance; the results of the available experimental evidence are quite complex and not easily summarized. We can agree with Sattler and Theye (1967), who concluded that the experimenter-subject relationship is important and that (perhaps) less qualified experimenters do not obtain appreciably different results than more qualified experimenters. Whether the race, ethnicity, physical characteristics, etc., of the experimenter significantly affect the testing situation seems to depend on a lot of other variables and, in general, do not seem to be as powerful an influence as many might think.

4. Subject variables. Do aspects of the subject, such as level of anxiety, physical attractiveness, etc., affect the testing situation? Masling (1960) used attractive female accomplices who, as test subjects, acted “warm” or “cold” toward the examiners (graduate students). The test results were interpreted by the graduate students more favorably when the subject acted warm than when she acted cold.

… is rare for such decisions to be based solely on test data. Yet in many situations, test data represent the only source of objective data standard for all candidates; other sources of data such as interviews, grades, and letters of recommendation are all “variable” – grades from different schools or different instructors are not comparable, nor are letters written by different evaluators.

Finally, as scientists, we should ask what is the empirical evidence for the accuracy of predicting future behavior. That is, if we are admitting college students to a particular institution, which sources of data, singly or in combination, such as interviewers’ opinions, test scores, high school GPA, etc., would be most accurate in making relevant predictions, such as, “Let’s admit Marlene because she will do quite well academically.” We will return to this issue, but for now let me indicate a general psychological principle that past behavior is the best predictor of future behavior, and a corollary that the results of psychological tests can provide very useful information on which to make more accurate future predictions.

In general what can we conclude? Aside from the fact that most studies in this area seem to have major design flaws and that many specific variables have not been explored consistently, Masling (1960) concluded that there is strong evidence of situational and interpersonal influences in projective testing, while Sattler and Theye (1967) concluded that:

1. Departures from standard procedures are more likely to affect “specialized” groups, such as children, schizophrenics, and juvenile delinquents than “normal” groups such as college students;
2. Children seem to be more susceptible to situational factors, especially discouragement, than are college-aged adults;
3. Rapport seems to be a crucial variable, while
degree of experience of the examiner is not; Relation of test content to predicted behavior.
Rebecca is enrolled in an introductory Spanish
4. Racial differences, specifically a white exam-
course and is given a Spanish vocabulary test
iner and a black subject, may be important, but
by the instructor. Is the instructor interested in
the evidence is not definitive.
whether Rebecca knows the meaning of the spe-
cific words on the test? Yes indeed, because the
Tests in decision making. In the real world, deci- test is designed to assess Rebecca’s mastery of
sions need to be made. To allow every person the vocabulary covered in class and in homework
who applies to medical school to be admitted assignments. Consider now a test such as the SAT,
would not only create huge logistical problems, given for college admission purposes. The test
but would result in chaos and in a situation may contain a vocabulary section, but the
that would be unfair to the candidates them- concern is not whether an individual knows
selves, some of whom would not have the intel- the particular words; knowledge of this sample
lectual and other competencies required to be of words is related to something else, namely
physicians, to the medical school faculty whose doing well academically in college. Finally, con-
teaching efforts would be diluted by the pres- sider a third test, the XYZ scale of depression.
ence of unqualified candidates, and eventually to Although the scale contains no items about sui-
the public who might be faced with incompetent cide ideation, it has been discovered empirically
physicians. that high scorers on this scale are likely to attempt
Given that decisions need to be made, we suicide. These three examples illustrate an impor-
must ask what role psychological tests can play in tant point: In psychological tests, the content of
such decision making. Most psychologists agree the test items may or may not cover the behav-
that major decisions should not be based on ior that is of interest – there may be a lack of
the results of a single test administration, that correspondence between test items and the pre-
whether or not state university admits Sandra dicted behavior. But a test can be quite useful if
should not be based solely on her SAT scores. an empirical correspondence between test scores
In fact, despite a stereotype to the contrary, it and real-life behavior can be shown.
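
The idea of an empirical correspondence between test scores and real-life behavior can be made concrete with a correlation coefficient. The sketch below is only an illustration: the function name pearson_r and both lists of numbers are invented for this example and do not come from any study cited here. It computes the Pearson correlation between a set of test scores and a later outcome (here, a made-up first-year grade point average).

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data, invented purely for illustration:
# vocabulary test scores and later first-year GPA for six students.
test_scores = [12, 15, 18, 22, 25, 30]
first_year_gpa = [2.1, 2.4, 2.9, 2.8, 3.3, 3.6]

r = pearson_r(test_scores, first_year_gpa)
print(f"r = {r:.2f}")
```

A coefficient near +1 or -1 would indicate that scores track the outcome closely, whatever the test items look like; a coefficient near 0 would indicate that the test says little about that behavior.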

CATEGORIES OF TESTS

Because there are thousands of tests, it would be helpful to be able to classify tests into categories, just as a bookstore might list its books under different headings. Because tests differ from each other in a variety of ways, there is no uniformly accepted system of classification. Therefore, we will invent our own based on a series of questions that can be asked of any test. I should point out that despite a variety of advances in both theory and technique, standardized tests have changed relatively little over the years (Linn, 1986), so while new tests are continually published, a classificatory system should be fairly stable, i.e., applicable today as well as 20 years from now.

Commercially published? The first question is whether a test is commercially published (sometimes called a proprietary test) or not. Major tests like the Stanford-Binet and the Minnesota Multiphasic Personality Inventory are available for purchase by qualified users through commercial companies. The commercial publisher advertises primarily through its catalog, and for many tests makes available, for a fee, a specimen set, usually the test booklet and answer sheet, a scoring key to score the test, and a test manual that contains information about the test. If a test is not commercially published, then a copy is ordinarily available from the test author, and there may be some accompanying information, or perhaps just the journal article where the test was first introduced. Sometimes journal articles include the original test, particularly if it is quite short, but often they will not. (Examples of articles that contain test items are R. L. Baker, Mednick & Hocevar, 1991; L. R. Good & K. C. Good, 1974; McLain, 1993; Rehfisch, 1958a; Snell, 1989; Vodanovich & Kass, 1990). Keep in mind that the contents of journal articles are copyrighted and permission to use a test must be obtained from both the author and the publisher.

If you are interested in learning more about a specific test, first you must determine if the test is commercially published. If it is, then you will want to consult the Mental Measurements Yearbook (MMY), available in most university libraries. Despite its name, the MMY is published at irregular intervals rather than yearly. However, it is an invaluable guide. For many commercially published tests, the MMY will provide a brief description of the test (its purpose, applicable age range, type of score generated, price, administration time, and name and address of publisher), a bibliography of citations relevant to the test, and one or more reviews of the test by test experts. Tests that are reviewed in one edition of the MMY may or may not be reviewed in subsequent editions, so locating information about a specific test may involve browsing through a number of editions. MMY reviews of specific tests are also available through a computer service called the Bibliographic Retrieval Services.

If the test you are interested in learning about is not commercially published, it will probably have an author(s) who published an article about the test in a professional journal. The journal article will most likely give the author’s address at the time of publication. If you are a “legitimate” test user, for example a graduate student doing a doctoral dissertation or a psychologist engaged in research work, a letter to the author will usually result in a reply with a copy of the test and permission to use it. If the author has moved from the original address, you may locate the current address through various directories and “Who’s Who” type of books, or through computer generated literature searches.

Administrative aspects. Tests can also be distinguished by various aspects of their administration. For example, there are group vs. individual tests; group tests can be administered to a group of subjects at the same time and individual tests to one person only at one time. The Stanford-Binet test of intelligence is an individual test, whereas the SAT is a group test. Clinicians who deal with one client at a time generally prefer individual tests because these often yield observational data in addition to a test score; researchers often need to test large groups of subjects in minimum time and may prefer group tests (there are, of course, many exceptions to this statement). A group test can be administered to one individual; sometimes, an individual test can be modified so it can be administered to a group.

Tests can also be classified as speed vs. power tests. Speed tests have a time limit that affects performance; for example, you might be given a page of printed text and asked to cross out all the “e’s” in 25 seconds. How many you cross out will
be a function of how fast you respond. A power test, on the other hand, is designed to measure how well you can do and so either may have no time limit or a time limit of convenience (a 50-minute hour) that ordinarily does not affect performance. The time limits on speed tests are usually set so that only 50% of the applicants are able to attempt every item. Time limits on power tests are set so that about 90% of the applicants can attempt all items.

Another administrative distinction is whether a test is a secure test or not. For example, the SAT is commercially published but is ordinarily not made available even to researchers. Many tests that are used in industry for personnel selection are secure tests whose utility could be compromised if they were made public. Sometimes only the scoring key is confidential, rather than the items themselves.

A final distinction from an administrative point of view is how invasive a test is. A questionnaire that asks about one’s sexual behaviors is ordinarily more invasive than a test of arithmetic; a test completed by the subject is usually more invasive than a report of an observer, who may report the observations without even the subject’s awareness.

The medium. Tests differ widely in the materials used, and so we can distinguish tests on this basis. Probably, the majority of tests are paper-and-pencil tests that involve some set of printed questions and require a written response, such as marking a multiple answer sheet. Other tests are performance tests that perhaps require the manipulation of wooden blocks or the placement of puzzle pieces in correct juxtaposition. Still other tests involve physiological measures such as the galvanic skin response, the basis of the polygraph (lie detector) machine. Increasing numbers of tests are now available for computer administration and this may become a popular category.

Item structure. Another way to classify tests, which overlaps with the approaches already mentioned, is through their item structure. Test items can be placed on a continuum from objective to subjective. At the objective end, we have multiple-choice items; at the subjective end, we have the type of open-ended questions that clinical psychologists and psychiatrists ask, such as “tell me more,” “how do you feel about that?” and “tell me about yourself.” In between, we have countless variations such as matching items (closer to the objective pole) and essay questions (closer to the subjective pole). Objective items are easy to score and to manipulate statistically, but individually reveal little other than that the person answered correctly or incorrectly. Subjective items are difficult and sometimes impossible to quantify, but can be quite a revealing and rich source of information.

Another possible distinction in item structure is whether the items are verbal in nature or require performance. Vocabulary and math items are labeled verbal because they are composed of verbal elements; building a block tower is a performance item.

Area of assessment. Tests can also be classified according to the area of assessment. For example, there are intelligence tests, personality questionnaires, tests of achievement, career-interest tests, tests of reading, tests of neuropsychological functioning, and so on. The MMY uses 16 such categories. These are not necessarily mutually exclusive categories, and many of them can be further subdivided. For example, tests of personality could be further categorized into introversion-extraversion, leadership, masculinity-femininity, and so on.

In this textbook, we look at five major categories of tests:

1. Personality tests, which have played a major role in the development of psychological testing, both in its acceptance and criticism. Personality represents a major area of human functioning for social-behavioral scientists and lay persons alike;
2. Tests of cognitive abilities, not only traditional intelligence tests, but other dimensions of cognitive or intellectual functioning. In some ways, cognitive psychology represents a major new emphasis in psychology which has had a significant impact on all aspects of psychology both as a science and as an applied field;
3. Tests of attitudes, values, and interests, three areas that psychometrically overlap, and also offer lots of basic testing lessons;
4. Tests of psychopathology, primarily those used by clinicians and researchers to study the field of mental illness; and
5. Tests that assess normal and positive functioning, such as creativity, competence, and self-esteem.

Test function. Tests can also be categorized depending upon their function. Some tests are used to diagnose present conditions. (Does the client have a character disorder? Is the client depressed?) Other tests are used to make predictions. (Will this person do well in college? Is this client likely to attempt suicide?) Other tests are used in selection procedures, which basically involve accepting or not accepting a candidate, as in admission to graduate school. Some tests are used for placement purposes – candidates who have been accepted are placed in a particular “treatment.” For example, entering students at a university may be placed in different level writing courses depending upon their performance in a writing exam. A battery of tests may be used to make such a placement decision or to assess which of several alternatives is most appropriate for the particular client – here the term typically used is classification (note that this term has both a broader meaning and a narrower meaning). Some tests are used for screening purposes; the term screening implies a rapid and rough procedure. Some tests are used for certification, usually related to some legal standard; thus passing a driving test certifies that the person has, at the very least, a minimum proficiency and is allowed to drive an automobile.

Score interpretation. Yet another classification can be developed on the basis of how scores on a test are interpreted. We can compare the score that an individual obtains with the scores of a group of individuals who also took the same test. This is called a norm-reference because we refer to norms to give a particular score meaning; for most tests, scores are interpreted in this manner. We can also give meaning to a score by comparing that score to a decision rule called a criterion, so this would be a criterion-reference. For example, when you took a driving test (either written and/or road), the examiner did not say, “Congratulations, your score is two standard deviations above the mean.” You either passed or failed based upon some predetermined criterion that may or may not have been explicitly stated. Note that norm-reference and criterion-reference refer not to the test but to how the score or performance is interpreted. The same test could yield either or both score interpretations.

Another distinction that can be made is whether the measurement provided by the test is normative or ipsative, that is, whether the standard of comparison reflects the behavior of others or of the client. Consider a 100-item vocabulary test that we administer to Marisa, and she obtains a score of 82. To make sense of that score, we compare her score with some normative data – for example, the average score of similar-aged college students. Now consider a questionnaire that asks Marisa to decide which of two values is more important to her: “Is it more important for you to have (1) a good paying job, or (2) freedom to do what you wish.” We could compare her choice with that of others, but in effect we have simply asked her to rank two items in terms of her own preferences or her own behavior; in most cases it would not be legitimate to compare her ranking with those of others. She may prefer choice number 2, but not by much, whereas for me choice number 2 is a very strong preference.

One way of defining ipsative is that the scores on the scale must sum to a constant. For example, if you are presented with a set of six ice cream flavors to rank order as to preference, no matter whether your first preference is “crunchy caramel” or “Bohemian tutti-frutti,” the sum of your six preferences will be 21 (1+2+3+4+5+6). On the other hand, if you were asked to rate each flavor independently on a 6-point scale, you could rate all of them high or all of them low; this would be a normative scale. Another way to define ipsative is to focus on the idea that in ipsative measurement, the mean is that of the individual, whereas in normative measurement the mean is that of the group. Ipsative measurement is found in personality assessment; we look at a technique called Q sort in Chapter 18. Block (1957) found that ipsative and normative ratings of personality were quite equivalent.

Another classificatory approach involves whether the responses made to the test are interpreted psychometrically or impressionistically. If the responses are scored and the scores interpreted on the basis of available norms and/or research data, then the process is a psychometric one. If instead the tester looks at the responses carefully on the basis of his/her expertise and
