Mastering Modern Psychological Testing Theory Methods
Psychological Assessment
Why We Do It and What It Is
Why do I need to learn about testing and assessment?
Chapter Outline
Brief History of Testing
The Language of Assessment
Assumptions of Psychological Assessment
Why Use Tests?
Common Applications of Psychological Assessments
Participants in the Assessment Process
Psychological Assessment in the 21st Century
Summary
Learning Objectives
After reading and studying this chapter, students should be able to:
1. Describe major milestones in the history of testing.
2. Define test, measurement, and assessment.
3. Describe and give examples of different types of tests.
4. Describe and give examples of different types of score interpretations.
5. Describe and explain the assumptions underlying psychological assessment.
6. Describe and explain the major applications of psychological assessments.
7. Explain why psychologists use tests.
8. Describe the major participants in the assessment process.
9. Describe some major trends in assessment.
From Chapter 1 of Mastering Modern Psychological Testing: Theory & Methods, First Edition. Cecil R. Reynolds,
Ronald B. Livingston. Copyright © 2012 by Pearson Education, Inc. All rights reserved.
INTRODUCTION TO PSYCHOLOGICAL ASSESSMENT
Most psychology students are drawn to the study of psychology because they want to
work with and help people, or alternatively to achieve an improved understanding of their own
thoughts, feelings, and behaviors. Many of these students aspire to be psychologists or counselors
and work in clinical settings. Other psychology students are primarily interested in research and
aspire to work at a university or other research institution. However, only a minority of
psychology students have a burning desire to specialize in psychological tests and measurement.
As a result, we often hear our students ask, “Why do I have to take a course in tests and
measurement?” This is a reasonable question, so when we teach test and measurement courses we
spend some time explaining to our students why they need to learn about testing and assessment.
This is one of the major goals of this chapter. Hopefully we will convince you this is a
worthwhile endeavor.
Psychological testing and assessment are important in virtually every aspect of professional
psychology. For those of you interested in clinical, counseling, or school psychology, research
has shown assessment is surpassed only by psychotherapy in terms of professional importance
(e.g., Norcross, Karg, & Prochaska, 1997; Phelps, Eisman, & Kohout, 1998). However, even this
finding does not give full credit to the important role assessment plays in professional practice.
As Meyer et al. (2001) observed, unlike psychotherapy, formal assessment is a unique feature
of the practice of psychology. That is, whereas a number of other mental health professionals
provide psychotherapy (e.g., psychiatrists, counselors, and social workers), psychologists are
the only mental health professionals who routinely conduct formal assessments. Special Interest
Topic 1 provides information about graduate psychology programs that train clinical, counseling,
and school psychologists.
The use of psychological tests and other assessments is not limited to clinical or health
settings. For example:
● Industrial and organizational psychologists devote a considerable amount of their professional
time to developing, administering, and interpreting tests. A major aspect of their work
is to develop assessments that will help identify prospective employees who possess the
skills and characteristics necessary to be successful on the job.
● Research psychologists also need to be proficient in measurement and assessment. Most
psychological research, regardless of its focus (e.g., social interactions, child development,
psychopathology, or animal behavior) involves measurement and/or assessment. Whether
a researcher is concerned with behavioral response time, visual acuity, intelligence, or
depressed mood (just to name a few), he or she will need to engage in measurement as
part of the research. In fact, all areas of science are concerned with and dependent on
measurement. Before one can study any variable in any science, it is necessary to be able
to ascertain its existence and measure important characteristics of the variable, construct,
or entity.
● Educational psychologists, who focus on the application of psychology in educational settings,
are often intricately involved in testing, measurement, and assessment issues. Their
involvement ranges from developing and analyzing tests that are used in educational settings
to educating teachers about how to develop better classroom assessments.
In summary, if you are interested in pursuing a career in psychology, it is likely that you
will engage in assessment to some degree, or at least need to understand the outcome of testing
and assessment activities. Before delving into contemporary issues in testing and assessment we
will briefly examine the history of psychological testing.
The earliest documented use of tests is usually attributed to the Chinese, who tested public officials
to ensure competence (analogous to contemporary civil service examinations). The Chinese
testing program evolved over the centuries and assumed various forms. For example, during the
Han dynasty written exams were included for the first time and covered five major areas:
agriculture, civil law, geography, military affairs, and revenue. During the fourth century the
program involved three arduous stages requiring the examinees to spend days isolated in small
booths composing essays and poems (Gregory, 2004).
CIVIL SERVICE EXAMINATIONS. Civil service tests similar to those used in China to select
government employees were introduced in European countries in the late 18th and early 19th
centuries. In 1883 the U.S. Civil Service Commission started using similar achievement tests to
aid selection of government employees (Anastasi & Urbina, 1997).
PHYSICIANS AND PSYCHIATRISTS. In the 19th century physicians and psychiatrists in England
and the United States developed classification systems to help classify individuals with mental
retardation and other mental problems. For example, in the 1830s the French physician Jean
Esquirol was one of the first to distinguish insanity (i.e., emotional disorders) from mental
deficiency (i.e., intellectual deficits present from birth). He also believed that mental retardation
existed on a continuum from mild to profound and observed that verbal skills were the most reliable
way of identifying the degree of mental retardation. In the 1890s Emil Kraepelin and others
promoted the use of free-association tests in assessing psychiatric patients. Free-association
tests involve the presentation of stimulus words to which the respondent responds “with the
first word that comes to mind.” Later, Sigmund Freud expanded on the technique, encouraging
patients to express freely any and all thoughts that came to mind in order to identify underlying
thoughts and emotions.
BRASS INSTRUMENTS ERA. Early experimental psychologists such as Wilhelm Wundt, Sir
Francis Galton, James McKeen Cattell, and Clark Wissler made significant contributions to the
development of cognitive ability testing. One of the most important developments of this period
was the move toward measuring human abilities using objective procedures that could be easily
replicated. These early pioneers used a variety of instruments, often made of brass, to measure
simple sensory and motor processes based on the assumption that they were measures of general
intelligence (e.g., Gregory, 2004). Some of these early psychologists’ contributions were so
substantive that they deserve special mention.
Sir Francis Galton. Galton is often considered the founder of mental tests and measurement. One
of his major accomplishments was the establishment of an
anthropometric laboratory at the International Health Exhibition in London in 1884. This laboratory
was subsequently moved to a museum where it operated for several years. During this
time data were collected including physical (e.g., height, weight, head circumference), sensory
(e.g., reaction time, sensory discrimination), and motor measurements (e.g., motor speed, grip
strength) on over 17,000 individuals. This represented the first large-scale systematic collection
of data on individual differences (Anastasi & Urbina, 1997; Gregory, 2004).
James McKeen Cattell. Cattell shared Galton’s belief that relatively simple sensory and motor
tests could be used to measure intellectual abilities. Cattell was instrumental in opening psychological
laboratories and spreading the growing testing movement in the United States. He is
thought to be the first to use the term mental test in an article he published in 1890. In addition
to his personal professional contributions, he had several students who went on to productive
careers in psychology, including E. L. Thorndike, R. S. Woodworth, and E. K. Strong (Anastasi
& Urbina, 1997; Gregory, 2004). Galton and Cattell also contributed to the development of testing
procedures such as standardized questionnaires and rating scales that later became popular
techniques in personality assessment (Anastasi & Urbina, 1997).
Clark Wissler. Wissler was one of Cattell’s students whose research largely discredited the work
of his famous teacher. Wissler found that the sensory-motor measures commonly being used to
assess intelligence had essentially no correlation with academic achievement. He also found that
the sensory-motor tests had only weak correlations with one another. These discouraging findings
essentially ended the use of the simple sensory-motor measures of intelligence and set the
stage for a new approach to intellectual assessment that emphasized more sophisticated higher
order mental processes. Ironically, there were significant methodological flaws in Wissler’s research
that prevented him from detecting moderate correlations that actually exist between some
sensory-motor tests and intelligence (to learn more about this interesting turn of events, see Fancher,
1985, and Sternberg, 1990). Nevertheless, it would take decades for researchers to discover
that they might have dismissed the importance of psychophysical measurements in investigating
intelligence, and the stage was set for Alfred Binet’s approach to intelligence testing emphasizing
higher order mental abilities (Gregory, 2004).
Twentieth-Century Testing
ALFRED BINET: BRING ON INTELLIGENCE TESTING! Binet initially experimented with sensory-motor
measurements such as reaction time and sensory acuity, but he became disenchanted with
them and pioneered the use of measures of higher order cognitive processes to assess intelligence
(Gregory, 2004). In the early 1900s the French government commissioned Binet and his colleague
Theodore Simon to develop a test to predict academic performance. The result of their efforts
was the first Binet-Simon Scale, released in 1905. The scale contained some sensory-perceptual
tests, but the emphasis was on verbal items assessing comprehension, reasoning, judgment, and
short-term memory. Binet and Simon achieved their goal and developed a test that was a good
predictor of academic success. Subsequent revisions of the Binet-Simon Scale were released in
1908 and 1911. These scales gained wide
acceptance in France and were soon translated and standardized in the United States, most successfully
by Louis Terman at Stanford University. This resulted in the Stanford-Binet Intelligence
Scale, which has been revised numerous times (the fifth revision, SB5, remains in use
today). Ironically, Terman’s version of the Binet-Simon Scale became even more popular in
France and other parts of Europe than the original scale.
ARMY ALPHA AND BETA TESTS. Intelligence testing received another boost in the United States
during World War I. The U.S. Army needed a way to assess and classify recruits as suitable for
the military and to classify them for jobs in the military. The American Psychological Association
(APA) and one of its past presidents, Robert M. Yerkes, developed a task force that devised
a series of aptitude tests that came to be known as the Army Alpha and Army Beta: one was verbal
(Alpha) and one nonverbal (Beta). Through their efforts and those of the Army in screening
recruits, literally millions of Americans became familiar with the concept of intelligence testing.
RORSCHACH INKBLOT TEST. Hermann Rorschach developed the Rorschach inkblots in the
1920s. There has been considerable debate about the psychometric properties of the Rorschach
(and other projective techniques), but it continues to be one of the more popular personality assessment
techniques in use at the beginning of the 21st century.
COLLEGE ADMISSION TESTS. The College Entrance Examination Board (CEEB) was originally
formed to provide colleges and universities with an objective and valid measure of students’
academic abilities and to move away from nepotism and legacy in admissions to academic merit.
Its efforts resulted in the development of the first Scholastic Aptitude Test (SAT) in 1926 (later
renamed the Scholastic Assessment Test and now known simply as the SAT). The American College
Testing Program (ACT) was initiated in 1959 and is the major competitor of the SAT. Prior to the
advent of these tests, college
admissions decisions were highly subjective and strongly influenced by family background and
status, so another purpose for the development of these instruments was to make the selection
process increasingly objective.
WECHSLER INTELLIGENCE SCALES. Intelligence testing received another boost in the 1930s,
when David Wechsler developed an intelligence test that included measures of both verbal and
nonverbal ability in the same test. Prior to Wechsler and the Wechsler-Bellevue I, intelligence tests
typically assessed verbal or nonverbal intelligence, not both. The Wechsler scales have become
the most popular intelligence tests in use today.
test (i.e., can be scored in an objective manner) and has been the subject of a large amount of
research. Its second edition, the MMPI-2, continues to be one of the most popular (if not the most
popular) personality assessments in use today.
Twenty-First-Century Testing
The last 60 years have seen an explosion in the development and use of psychological
and educational tests. For example, a recent search of the Mental Measurements Yearbook
resulted in the identification of over 600 tests listed in the category on personality tests and
over 400 in the categories of intelligence and aptitude tests. Later in this chapter we will examine
some current trends in assessment and some factors that we expect will influence the
trajectory of assessment practices in the 21st century. We have summarized this timeline for
you in Table 1.
MEASUREMENT. Measurement can be defined as a set of rules for assigning numbers to represent
objects, traits, attributes, or behaviors. A psychological test is a measuring device, and therefore
involves rules (e.g., administration guidelines and scoring criteria) for assigning numbers
that represent an individual’s performance. In turn, these numbers are interpreted as reflecting
characteristics of the test taker. For example, the number of items endorsed in a positive manner
(e.g., “True” or “Like Me”) on a depression scale might be interpreted as reflecting a client’s
experience of depressed mood.
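The idea of a scoring rule can be made concrete with a short sketch. The Python below is only an illustration, not a scoring procedure from any published instrument: the function name raw_score, the set POSITIVE_RESPONSES, and the sample answers are all hypothetical, and the only detail taken from the text is that responses such as “True” or “Like Me” count as positive endorsements.

```python
from typing import Iterable

# Assumption: these are the responses that count as a positive endorsement.
# The chapter gives "True" and "Like Me" only as examples.
POSITIVE_RESPONSES = {"True", "Like Me"}


def raw_score(responses: Iterable[str]) -> int:
    """Apply a simple scoring rule: count the items endorsed positively."""
    return sum(1 for response in responses if response in POSITIVE_RESPONSES)


# Hypothetical answers from one client on a five-item scale.
answers = ["True", "False", "True", "True", "False"]
print(raw_score(answers))  # 3
```

The number produced by the rule is a raw score. Interpreting it as the client's experience of depressed mood is a separate step that depends on the quality of the test.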
relatively free from measurement errors will be stable or consistent (i.e., reliable) and are thus
more accurate in estimating some value. Validity, in simplest terms, refers to the appropriateness
or accuracy of the interpretations of test scores. If test scores are interpreted as
reflecting intelligence, do they actually reflect intellectual ability? If test scores are used to
predict success on a job, can they accurately predict who will be successful on the job?
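One common way to put numbers on these two ideas is with a correlation coefficient. The sketch below is illustrative and rests on assumptions that do not come from this chapter: it treats consistency as the correlation between two administrations of the same test, and it treats predictive accuracy as the correlation between test scores and a later job-performance rating. The helper pearson_r and all of the scores are invented for the example.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return covariation / (spread_x * spread_y)


# Invented data: the same five examinees tested on two occasions.
first_testing = [10, 12, 15, 18, 20]
second_testing = [11, 12, 16, 17, 21]

# Invented data: later supervisor ratings for the same five examinees.
job_ratings = [2, 3, 3, 4, 5]

# Consistency of scores across occasions (higher means more stable scores).
print(round(pearson_r(first_testing, second_testing), 2))

# How well the first set of scores predicts the later criterion.
print(round(pearson_r(first_testing, job_ratings), 2))
```

A high first value would suggest stable scores. A high second value would support, but not by itself establish, the interpretation that the scores predict job success.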
Types of Tests
We defined a test as a device or procedure in which a sample of an individual’s behavior is
obtained, evaluated, and scored using standardized procedures (AERA et al., 1999). You have
probably taken a large number of tests in your life, and it is likely that you have noticed that not
all tests are alike. For example, people take tests in schools that help determine their grades,
a test to obtain a driver’s license, interest inventories to help make educational and vocational
decisions, admissions tests when applying for college, exams to obtain professional certificates
and licenses, and personality tests to gain personal understanding. This brief list is clearly not
exhaustive!
Cronbach (1990) noted that tests generally can be classified as measures of either maximum
performance or typical response. Maximum performance tests also are referred to as ability
tests, but achievement tests are included here as well. On maximum performance tests items may
be scored as either “correct” or “incorrect” and examinees are encouraged to demonstrate
their very best performance. Maximum performance tests are designed to assess the
upper limits of the examinee’s knowledge and abilities. For example, maximum performance
tests can be designed to assess how well a student can perform selected tasks (e.g., 3-digit
multiplication) or has mastered a specified content domain (e.g., American history). Intelligence
tests and classroom achievement tests are common examples of maximum performance
tests. In contrast, typical response tests attempt to measure the typical behavior and characteristics
of examinees. Often, typical response tests are referred to as personality tests, and in
this context personality is used broadly to reflect a host of noncognitive characteristics such
as attitudes, behaviors, emotions, and interests (Anastasi & Urbina, 1997). Some individuals
reserve the term test for maximum performance measures, while using terms such as scale
or inventory when referring to typical response instruments. In this textbook we use the term
test in its broader sense, applying it to both maximum performance and typical response
procedures.
MAXIMUM PERFORMANCE TESTS. Maximum performance tests are designed to assess the
upper limits of the examinee’s knowledge and abilities. Within the broad category of maximum
performance tests, there are a number of subcategories. First, maximum performance tests are
often classified as either achievement tests or aptitude tests. Second, maximum performance tests
can be classified as either objective or subjective. Finally, maximum performance tests are often
described as either speed or power tests. These distinctions, although not absolute in nature, have