Mastering Modern Psychological Testing Theory Methods

Introduction to Psychological Assessment
Why We Do It and What It Is
Why do I need to learn about testing and assessment?

Chapter Outline
Brief History of Testing
The Language of Assessment
Assumptions of Psychological Assessment
Why Use Tests?
Common Applications of Psychological Assessments
Participants in the Assessment Process
Psychological Assessment in the 21st Century
Summary

Learning Objectives
After reading and studying this chapter, students should be able to:
1. Describe major milestones in the history of testing.
2. Define test, measurement, and assessment.
3. Describe and give examples of different types of tests.
4. Describe and give examples of different types of score interpretations.
5. Describe and explain the assumptions underlying psychological assessment.
6. Describe and explain the major applications of psychological assessments.
7. Explain why psychologists use tests.
8. Describe the major participants in the assessment process.
9. Describe some major trends in assessment.

From Chapter 1 of Mastering Modern Psychological Testing: Theory & Methods, First Edition. Cecil R. Reynolds,
Ronald B. Livingston. Copyright © 2012 by Pearson Education, Inc. All rights reserved.
INTRODUCTION TO PSYCHOLOGICAL ASSESSMENT

Most psychology students are drawn to the study of psychology because they want to
work with and help people, or alternatively to achieve an improved understanding of their own
thoughts, feelings, and behaviors. Many of these students aspire to be psychologists or counselors
and work in clinical settings. Other psychology students are primarily interested in research and
aspire to work at a university or other research institution. However, only a minority of
psychology students have a burning desire to specialize in psychological tests and measurement.
As a result, we often hear our students ask, “Why do I have to take a course in tests and
measurement?” This is a reasonable question, so when we teach test and measurement courses we
spend some time explaining to our students why they need to learn about testing and assessment.
This is one of the major goals of this chapter. Hopefully we will convince you this is a
worthwhile endeavor.
Psychological testing and assessment are important in virtually every aspect of profession-
al psychology. For those of you interested in clinical, counseling, or school psychology, research
has shown assessment is surpassed only by psychotherapy in terms of professional importance
(e.g., Norcross, Karg, & Prochaska, 1997; Phelps, Eisman, & Kohout, 1998). However, even this
finding does not give full credit to the important role assessment plays in professional practice.
As Meyer et al. (2001) observed, unlike psychotherapy, formal assessment is a unique feature
of the practice of psychology. That is, whereas a number of other mental health professionals
provide psychotherapy (e.g., psychiatrists, counselors, and social workers), psychologists are
the only mental health professionals who routinely conduct formal assessments. Special Interest
Topic 1 provides information about graduate psychology programs that train clinical, counseling,
and school psychologists.
The use of psychological tests and other assessments is not limited to clinical or health
settings. For example:
● Industrial and organizational psychologists devote a considerable amount of their professional
time to developing, administering, and interpreting tests. A major aspect of their work
is to develop assessments that will help identify prospective employees who possess the
skills and characteristics necessary to be successful on the job.
● Research psychologists also need to be proficient in measurement and assessment. Most
psychological research, regardless of its focus (e.g., social interactions, child development,
psychopathology, or animal behavior) involves measurement and/or assessment. Whether
a researcher is concerned with behavioral response time, visual acuity, intelligence, or
depressed mood (just to name a few), he or she will need to engage in measurement as
part of the research. In fact, all areas of science are concerned with and dependent on
measurement. Before one can study any variable in any science, it is necessary to be able
to ascertain its existence and measure important characteristics of the variable, construct,
or entity.
● Educational psychologists, who focus on the application of psychology in educational set-
tings, are often intricately involved in testing, measurement, and assessment issues. Their
involvement ranges from developing and analyzing tests that are used in educational set-
tings to educating teachers about how to develop better classroom assessments.


SPECIAL INTEREST TOPIC 1


How Do Clinical, Counseling, and School Psychologists Differ?
In the introductory section of this chapter we discussed the important role that testing and
assessment play in professional psychology. The American Psychological Association accredits professional
training programs in the areas of clinical, counseling, and school psychology. Although there are many
other types of psychology programs (e.g., social psychology, quantitative psychology, and physiological
psychology), most psychologists working with patients or clients are trained in a clinical, counseling, or
school psychology program. Some refer to these programs as “practice-oriented” because their gradu-
ates are trained to provide psychological health care services. Graduates of doctoral-level clinical, coun-
seling, and school psychology programs typically qualify for equivalent professional benefits, such as
professional licensing, independent practice, and eligibility for insurance reimbursement (e.g., Norcross,
2000). However, there are some substantive differences. If you are considering a career in psychology,
and you would like to work in applied settings with individuals with emotional or behavioral problems,
it is helpful to understand the differences among these training programs.
Before describing how clinical, counseling, and school psychology programs differ, it might be
helpful to distinguish between programs that confer a PhD (Doctor of Philosophy) and those that pro-
vide a PsyD (Doctor of Psychology). PhD programs provide broad training in both clinical and research
applications (i.e., scientist-practitioner model) whereas PsyD programs typically focus primarily on clinical
training and place less emphasis on research. There are PhD and PsyD programs in clinical, counseling,
and school psychology, and the relative merits of the PhD versus the PsyD are often fiercely debated.
We will not delve into this debate at this time, but do encourage you to seek out advice from trusted
professors as to which degree is most likely to help you reach your career goals.
At this point, it is probably useful to distinguish between clinical and counseling psychology
programs. Historically, clinical psychology programs trained psychologists to work with clients with the
more severe forms of psychopathology (e.g., schizophrenia, bipolar disorder, dementia) whereas coun-
seling psychology programs trained psychologists to work with clients with less severe problems (e.g.,
adjustment disorders, career and educational counseling, couple/marriage counseling). You still see this
distinction to some degree, but the differences have diminished over the years. In fact, the American
Psychological Association (APA) stopped distinguishing between clinical and counseling psychology in-
ternships many years ago. Noting the growing similarity of these programs, Norcross (2000) stated that
some notable differences still exist. These include:
◆ Clinical psychology programs are more abundant and produce more graduates. There are 194
APA-accredited clinical psychology programs and 64 APA-accredited counseling psychology
programs.
◆ In terms of psychological assessment, clinical psychologists tend to use more projective personal-
ity assessments whereas counseling psychologists use more career and vocational assessments.
◆ In terms of theoretical orientation, the majority of both clinical and counseling psychologists favor
an eclectic/integrative or cognitive-behavioral approach. However, clinical psychologists are more
likely to endorse a psychoanalytic or behavioral orientation, whereas counseling psychologists
tend to favor client-centered or humanistic approaches.
◆ Clinical psychologists are more likely to work in private practice, hospitals, or medical schools,
whereas counseling psychologists are more likely to work in university counseling centers and
community mental health settings.
◆ Students entering clinical and counseling programs are similar in terms of GRE scores and under-
graduate GPA. However, it is noted that more students with master's degrees enter counseling
programs than clinical programs (67% for PhD counseling programs vs. 21% for PhD clinical
programs).


In summary, even though clinical and counseling psychology programs have grown in similar-
ity in recent years, some important differences still exist. We will now turn to the field of school psy-
chology. School psychology programs prepare professionals to work with children and adolescents in
school settings. Clinical and counseling programs can prepare their graduates to work with children
and adolescents; however, this is the focus of school psychology programs. Like clinical and counseling
psychologists, school psychologists are trained in psychotherapy and counseling techniques. They also
receive extensive training in psychological assessment (cognitive, emotional, and behavioral) and learn
to consult with parents and other professionals to promote the school success of their clients. There are
currently 57 APA-accredited school psychology training programs.
This discussion has focused on only doctorate-level training programs. There are many more
master's-level psychology programs that also prepare students for careers as mental health profession-
als. Because professional licensing is controlled by the individual states, each state determines the edu-
cational and training criteria for licensing, and this is something else one should take into consideration
when choosing a graduate school.
You can learn more about these areas of psychological specialization at the website of the
American Psychological Association (http://www.apa.org). The official definition of each specialty
area is housed there, along with other archival documents that describe each specialty in detail
and that have been accepted by the APA's Commission for the Recognition of Specialties and
Proficiencies in Professional Psychology (CRSPPP).

In summary, if you are interested in pursuing a career in psychology, it is likely that you
will engage in assessment to some degree, or at least need to understand the outcome of testing
and assessment activities. Before delving into contemporary issues in testing and assessment we
will briefly examine the history of psychological testing.

BRIEF HISTORY OF TESTING


Anastasi and Urbina (1997) stated that the actual “roots of testing are lost in antiquity” (p. 32).
Some writers suggest the first test was actually the famous “Apple Test” given to Eve in the
Garden of Eden. However, if one excludes biblical references, testing is usually traced back
to the early Chinese. The following section highlights some of the milestones in the history of
testing.

Earliest Testing: Circa 2200 BC

The earliest documented use of tests is usually attributed to the Chinese who tested public officials
to ensure competence (analogous to contemporary civil service examinations). The Chinese
testing program evolved over the centuries and assumed various forms. For example, during the
Han dynasty written exams were included for the first time and covered five major areas:
agriculture, civil law, geography, military affairs, and revenue. During the fourth century the
program involved three arduous stages requiring the examinees to spend days isolated in small
booths composing essays and poems (Gregory, 2004).


Eighteenth- and Nineteenth-Century Testing


CARL FRIEDRICH GAUSS. Gauss (1777–1855) was a noted German mathematician who also
made important contributions in astronomy and the study of magnetism. In the course of tracking
star movements he found that his colleagues often came up with slightly different locations. He
plotted the frequency of the observed locations systematically and found the observations to take
the shape of a curve—the curve we have come to know as the normal curve or normal distribu-
tion (also known as the Gaussian curve). He determined that the best estimate of the precise
location of the star was the mean of the observations and that each independent observation
contained some degree of error. Although Gauss is not typically recognized as a pioneer in test-
ing, we believe his formal recognition of measurement error and its distributional characteristics
earns him this recognition.
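
Gauss's reasoning can be illustrated with a short simulation. This is only a sketch and is not drawn from the historical record: the "true" position, the size of the errors, and the number of observations are hypothetical values chosen for illustration, and the code assumes that each observation equals the true value plus an independent, normally distributed error.

```python
import random
import statistics

# Hypothetical values chosen only for illustration.
TRUE_POSITION = 100.0   # the star's "true" location, in arbitrary units
ERROR_SD = 2.0          # spread of the error in each observation
N_OBSERVATIONS = 25     # number of independent observations

random.seed(1)  # fixed seed so the run is repeatable

# Assumed model: observation = true value + independent random error.
observations = [
    random.gauss(TRUE_POSITION, ERROR_SD) for _ in range(N_OBSERVATIONS)
]

# The mean of the observations serves as the estimate of the true position.
estimate = statistics.mean(observations)

# Size of the error in each individual observation.
errors = [abs(x - TRUE_POSITION) for x in observations]

print(f"Mean of observations:     {estimate:.2f}")
print(f"Error of the mean:        {abs(estimate - TRUE_POSITION):.2f}")
print(f"Average individual error: {statistics.mean(errors):.2f}")
print(f"Largest individual error: {max(errors):.2f}")
```

Whatever seed is used, the error of the mean can never exceed the average individual error, because positive and negative errors partly cancel when the observations are averaged.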

CIVIL SERVICE EXAMINATIONS. Civil service tests similar to those used in China to select
government employees were introduced in European countries in the late 18th and early 19th
centuries. In 1883 the U.S. Civil Service Commission started using similar achievement tests to
aid selection of government employees (Anastasi & Urbina, 1997).

PHYSICIANS AND PSYCHIATRISTS. In the 19th century physicians and psychiatrists in England
and the United States developed classification systems to help classify individuals with mental
retardation and other mental problems. For example, in the 1830s the French physician Jean
Esquirol was one of the first to distinguish insanity (i.e., emotional disorders) from mental
deficiency (i.e., intellectual deficits present from birth). He also believed that mental retardation
existed on a continuum from mild to profound and observed that verbal skills were the most reli-
able way of identifying the degree of mental retardation. In the 1890s Emil Kraepelin and others
promoted the use of free-association tests in assessing psychiatric patients. Free-association
tests involve the presentation of stimulus words to which the respondent responds “with the
first word that comes to mind.” Later Sigmund Freud expanded on the technique encouraging
patients to express freely any and all thoughts that came to mind in order to identify underlying
thoughts and emotions.

BRASS INSTRUMENTS ERA. Early experimental psychologists such as Wilhelm Wundt, Sir
Francis Galton, James McKeen Cattell, and Clark Wissler made significant contributions to the
development of cognitive ability testing. One of the most important developments of this period
was the move toward measuring human abilities using objective procedures that could be easily
replicated. These early pioneers used a variety of instruments, often made of brass, to measure
simple sensory and motor processes based on the assumption that they were measures of general
intelligence (e.g., Gregory, 2004). Some of these early psychologists’ contributions were so
substantive that they deserve special mention.
Sir Francis Galton. Galton is often considered the founder of mental tests and measurement.
One of his major accomplishments was the establishment of an
anthropometric laboratory at the International Health Exhibition in London in 1884. This labo-
ratory was subsequently moved to a museum where it operated for several years. During this
time data were collected including physical (e.g., height, weight, head circumference), sensory
(e.g., reaction time, sensory discrimination), and motor measurements (e.g., motor speed, grip
strength) on over 17,000 individuals. This represented the first large-scale systematic collection
of data on individual differences (Anastasi & Urbina, 1997; Gregory, 2004).
James McKeen Cattell. Cattell shared Galton’s belief that relatively simple sensory and motor
tests could be used to measure intellectual abilities. Cattell was instrumental in opening psy-
chological laboratories and spreading the growing testing movement in the United States. He is
thought to be the first to use the term mental test in an article he published in 1890. In addition
to his personal professional contributions, he had several students who went on to productive
careers in psychology, including E. L. Thorndike, R. S. Woodworth, and E. K. Strong (Anastasi
& Urbina, 1997; Gregory, 2004). Galton and Cattell also contributed to the development of test-
ing procedures such as standardized questionnaires and rating scales that later became popular
techniques in personality assessment (Anastasi & Urbina, 1997).
Clark Wissler. Wissler was one of Cattell’s students whose research largely discredited the work
of his famous teacher. Wissler found that the sensory-motor measures commonly being used to
assess intelligence had essentially no correlation with academic achievement. He also found that
the sensory-motor tests had only weak correlations with one another. These discouraging find-
ings essentially ended the use of the simple sensory-motor measures of intelligence and set the
stage for a new approach to intellectual assessment that emphasized more sophisticated higher
order mental processes. Ironically, there were significant methodological flaws in Wissler’s
research that prevented him from detecting moderate correlations that actually exist between some
sensory-motor tests and intelligence (to learn more about this interesting turn of events, see Fan-
cher, 1985, and Sternberg, 1990). Nevertheless, it would take decades for researchers to discover
that they might have dismissed the importance of psychophysical measurements in investigating
intelligence, and the stage was set for Alfred Binet’s approach to intelligence testing emphasizing
higher order mental abilities (Gregory, 2004).

Twentieth-Century Testing
ALFRED BINET—BRING ON INTELLIGENCE TESTING! Binet initially experimented with sensory-
motor measurements such as reaction time and sensory acuity, but he became disenchanted with
them and pioneered the use of measures of higher order cognitive processes to assess intelligence
(Gregory, 2004). In the early 1900s the French government commissioned Binet and his colleague
Theodore Simon to develop a test to predict academic performance. The result of their efforts
was the first Binet-Simon Scale, released in 1905. The scale contained some sensory-perceptual
tests, but the emphasis was on verbal items assessing comprehension, reasoning, judgment, and
short-term memory. Binet and Simon achieved their goal and developed a test that was a good
predictor of academic success. Subsequent revisions of the Binet-Simon Scale were released in
1908 and 1911. These scales gained wide
acceptance in France and were soon translated and standardized in the United States, most suc-
cessfully by Louis Terman at Stanford University. This resulted in the Stanford-Binet Intel-
ligence Scale, which has been revised numerous times (the fifth revision, SB5, remains in use
today). Ironically, Terman’s version of the Binet-Simon Scale became even more popular in
France and other parts of Europe than the original scale.

ARMY ALPHA AND BETA TESTS. Intelligence testing received another boost in the United States
during World War I. The U.S. Army needed a way to assess and classify recruits as suitable for
the military and to classify them for jobs in the military. The American Psychological Associa-
tion (APA) and one of its past presidents, Robert M. Yerkes, developed a task force that devised
a series of aptitude tests that came to be known as the Army Alpha and Army Beta—one was ver-
bal (Alpha) and one nonverbal (Beta). Through their efforts and those of the Army in screening
recruits literally millions of Americans became familiar with the concept of intelligence testing.

ROBERT WOODWORTH—BRING ON PERSONALITY TESTING! In 1918 Robert Woodworth developed
the Woodworth Personal Data Sheet, which is widely considered to be the first formal
personality test. The Personal Data Sheet was designed to help collect personal information about
military recruits. Much as the development of the Binet scales ushered in the era of intelligence
testing, the introduction of the Woodworth Personal Data Sheet ushered in the era of personality
assessment.

RORSCHACH INKBLOT TEST. Hermann Rorschach developed the Rorschach inkblots in the
1920s. There has been considerable debate about the psychometric properties of the Rorschach
(and other projective techniques), but it continues to be one of the more popular personality as-
sessment techniques in use at the beginning of the 21st century.

COLLEGE ADMISSION TESTS. The College Entrance Examination Board (CEEB) was originally
formed to provide colleges and universities with an objective and valid measure of students’
academic abilities and to move away from nepotism and legacy in admissions to academic merit.
Its efforts resulted in the development of the first Scholastic Aptitude Test (SAT) in 1926 (now
called the Scholastic Assessment Test). The American College Testing Program (ACT) was ini-
tiated in 1959 and is the major competitor of the SAT. Prior to the advent of these tests, college
admissions decisions were highly subjective and strongly influenced by family background and
status, so another purpose for the development of these instruments was to make the selection
process increasingly objective.

WECHSLER INTELLIGENCE SCALES. Intelligence testing received another boost in the 1930s,
when David Wechsler developed an intelligence test that included measures of verbal and
nonverbal ability on the same test. Prior to Wechsler, and the Wechsler-Bellevue I, intelligence tests
typically assessed verbal or nonverbal intelligence, not both. The Wechsler scales have become
the most popular intelligence tests in use today.

MINNESOTA MULTIPHASIC PERSONALITY INVENTORY (MMPI). The MMPI was published
in the early 1940s to aid in the diagnosis of psychiatric disorders. It is an objective personality
test (i.e., can be scored in an objective manner) and has been the subject of a large amount of
research. Its second edition, the MMPI-2, continues to be one of the most popular (if not the most
popular) personality assessments in use today.

TABLE 1 Milestones in Testing History

Circa 2200 BC              Chinese test public officials
Late 1700s & Early 1800s   Carl Friedrich Gauss discovers the normal distribution when
                           evaluating measurement error
                           Civil service exams used in Europe
19th Century               Physicians and psychiatrists assess mental patients with new techniques
                           Brass instruments era: emphasis on measuring sensory and motor abilities
                           Civil service exams initiated in United States in 1883
                           Early attention to questionnaires and rating scales by Galton and Cattell
1905                       Binet-Simon Scale released; ushers in the era of intelligence testing
1917                       Army Alpha and Beta released; first group-administered intelligence tests
1918                       Woodworth Personal Data Sheet released; ushers in the era of
                           personality assessment
1920s                      Scholastic Aptitude Test (SAT) and Rorschach inkblot test developed;
                           testing expands its influence
1930s                      David Wechsler releases the Wechsler-Bellevue I; initiates a series of
                           influential intelligence tests
1940s                      The Minnesota Multiphasic Personality Inventory (MMPI) released;
                           destined to become the leading objective personality inventory

Twenty-First-Century Testing
The last 60 years have seen an explosion in terms of test development and use of psychologi-
cal and educational tests. For example, a recent search of the Mental Measurements Yearbook
resulted in the identification of over 600 tests listed in the category on personality tests and
over 400 in the categories of intelligence and aptitude tests. Later in this chapter we will ex-
amine some current trends in assessment and some factors that we expect will influence the
trajectory of assessment practices in the 21st century. We have summarized this time line for
you in Table 1.

THE LANGUAGE OF ASSESSMENT


We have already used a number of relatively common but somewhat technical terms. Before
proceeding it would be helpful to define them for you.

Tests, Measurement, and Assessment


TESTS A test is a device or procedure in which a sample of an individual’s behavior is
obtained, evaluated, and scored using standardized procedures (American Educational Research
Association [AERA], American Psychological Association [APA], & National Council on Measurement
in Education [NCME], 1999). This is a rather broad or general definition, but at this point in
our discussion we are best served with this broad definition. Rest assured that we will provide
more specific information on different types of tests in due time. Before proceeding we should
elaborate on one aspect of our definition of a test: that a test is a sample of behavior.
Because a test is only a sample of behavior, it is important that tests reflect a representative
sample of the behavior you are interested in. The importance of the concept of a representative
sample will become more apparent as we proceed with our study of testing and assessment, and we
will touch on it in more detail in later chapters when we address the technical properties of tests.
Standardized Tests A standardized test is a test that is administered, scored, and interpreted
in a standard manner. Most standardized tests are developed by testing professionals or test
publishing companies. The goal of standardization is to ensure that testing conditions are as
nearly the same as is possible for all individuals taking the test. If this is accomplished, no
examinee will have an advantage over another due to variance in administration procedures, and
assessment results will be comparable.

MEASUREMENT Measurement can be defined as a set of rules for assigning numbers to repre-
sent objects, traits, attributes, or behaviors. A psychological test is a measuring device, and there-
fore involves rules (e.g., administration guidelines and scoring criteria) for assigning numbers
that represent an individual’s performance. In turn, these numbers are interpreted as reflecting
characteristics of the test taker. For example, the number of items endorsed in a positive manner
(e.g., “True” or “Like Me”) on a depression scale might be interpreted as reflecting a client’s
experience of depressed mood.
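
The idea of a scoring rule can be made concrete with a small sketch. Everything in it is hypothetical: the five items, the keyed response, and the client's answers are invented for illustration and do not come from any published depression scale. The rule it applies is the one described above, namely one point for each item endorsed in the keyed direction.

```python
# Hypothetical five-item scale; the item wording and scoring key are
# invented for illustration and are not taken from any published test.
ITEMS = [
    "I feel sad most of the day.",
    "I have lost interest in things I used to enjoy.",
    "I have trouble sleeping.",
    "I feel tired nearly every day.",
    "I find it hard to concentrate.",
]
KEYED_RESPONSE = "True"  # the response that earns one point on every item


def score_scale(responses):
    """Apply the scoring rule: one point per item answered in the keyed direction."""
    if len(responses) != len(ITEMS):
        raise ValueError("one response per item is required")
    return sum(1 for response in responses if response == KEYED_RESPONSE)


# Hypothetical responses from one client, in item order.
client_responses = ["True", "False", "True", "True", "False"]
raw_score = score_scale(client_responses)
print(f"Raw score: {raw_score} of {len(ITEMS)}")  # prints "Raw score: 3 of 5"
```

The number produced is only a raw score; interpreting it as reflecting depressed mood is a separate step that depends on the technical properties of the scale.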

ASSESSMENT Assessment is defined as a systematic procedure for collecting information that
can be used to make inferences about the characteristics of people or objects (AERA et al.,
1999). Assessment should lead to an increased understanding of these characteristics. Tests are
obviously one systematic method of collecting information and are therefore one set of tools for
assessment. Reviews of historical records, interviews, and observations are also legitimate
assessment techniques and all are maximally useful when they are integrated. In fact, assessment
typically refers to a process that involves the integration of information obtained from
multiple sources using multiple methods. Therefore, assessment is a broader, more comprehensive
process than testing.


In contrasting psychological testing and psychological assessment, Meyer et al. (2001)
observed that testing is a relatively straightforward process where a specific test is adminis-
tered to obtain a specific score. In contrast, psychological assessment integrates multiple scores,
typically obtained using multiple tests, with information collected by reviewing records, con-
ducting interviews, and conducting observations. The goal is to develop a better understanding
of the client, answer referral questions (e.g., Why is this student doing poorly at school?), and
communicate these findings to the appropriate individuals.
McFall and Trent (1999) go a bit further and remind us that the “aim of clinical
assessment is to gather data that allow us to reduce uncertainty regarding the probabilities of
events” (p. 215). Such an “event” in a clinical setting might be the probability that a patient
has depression versus bipolar disorder versus dementia, all of which require different treatments. In
personnel psychology it might be the probability that an applicant to the police force will be
successful in completing the police academy, a very expensive form of training. Test scores are useful
in this process to the extent they “allow us to predict or control events with greater accuracy
or with less error than we could have done without them” (McFall & Trent, 1999, p. 217). In
reference to the previous events, diagnosis and employment selection, tests are useful in the
assessment process to the extent they make us more accurate in reaching diagnostic or hiring
decisions, respectively. Professionals in all areas make errors; our goal is to minimize these
mistakes in number and in magnitude. In psychology, testing and assessment, done properly,
are of great benefit to these goals.
Now that we have defined these common terms, with some reluctance we acknowledge
that in actual practice many professionals use testing, measurement, and assessment inter-
changeably. Recognizing this, Popham (2000) noted that among many professionals assess-
ment has become the preferred term. Measurement sounds rather rigid and sterile when applied
to people and tends to be avoided. Testing has its own negative connotations. For example,
hardly a week goes by when newspapers don’t contain articles about “teaching to the test” or
“high-stakes testing,” typically with negative connotations. Additionally, when people hear the
word test they usually think of paper-and-pencil tests. In recent years, as a result of growing
dissatisfaction with traditional paper-and-pencil tests, alternative testing procedures have been
developed (e.g., performance assessments and portfolios). As a result, testing is not seen as
particularly descriptive of modern practices. That leaves us with assessment as the contempo-
rary popular term.
There are some additional terms that you should be familiar with. Evaluation is a term
often used when discussing assessment, testing, and measurement-related issues. Evaluation is
an activity that involves judging or appraising the value or worth of something. For example,
assigning formal grades to students to reflect their academic performance is referred to as sum-
mative evaluation. Psychometrics is the science of psychological measurement, and a psychom-
etrician is a psychological or educational professional who has specialized in the area of testing,
measurement, and assessment. You will likely hear people refer to the psychometric properties
of a test, and by this they mean the measurement or statistical characteristics of a test. These
measurement characteristics include reliability and validity. Reliability refers to the
stability, consistency, and relative accuracy of the test scores. On a more theoretical level,
reliability refers to the degree to which test scores are free from measurement errors. Scores
that are relatively free from measurement errors will be stable or consistent (i.e., reliable)
and are thus more accurate in estimating some value. Validity, in simplest terms, refers to the
appropriateness or accuracy of the interpretations of test scores. If test scores are
interpreted as reflecting intelligence, do they actually reflect intellectual ability? If test
scores are used to predict success on a job, can they accurately predict who will be successful
on the job?

Reliability refers to the stability or consistency of test scores.

Validity refers to the accuracy of the interpretation of test scores.
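
To make the theoretical definition of reliability more concrete, the following is a minimal sketch in Python. It assumes the classical test theory model, which this passage does not spell out: each observed score is a true score plus random measurement error, and reliability is the proportion of observed-score variance that is true-score variance. The function names, the score scale (mean 100, standard deviation 15), and the error values are hypothetical choices made only for illustration.

```python
import random
import statistics


def simulate_scores(n_examinees, error_sd, seed=0):
    """Simulate true and observed scores (hypothetical helper).

    Assumption: observed score = true score + random measurement error.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(100, 15) for _ in range(n_examinees)]
    observed = [t + rng.gauss(0, error_sd) for t in true_scores]
    return true_scores, observed


def reliability(true_scores, observed):
    """Proportion of observed-score variance that is true-score variance.

    True scores are never directly observable in practice; they are
    available here only because the data are simulated.
    """
    return statistics.pvariance(true_scores) / statistics.pvariance(observed)


# Compare tests with no, small, and large amounts of measurement error.
for error_sd in (0.0, 5.0, 15.0):
    true_scores, observed = simulate_scores(2000, error_sd)
    print(error_sd, round(reliability(true_scores, observed), 2))
```

In this sketch, scores with no measurement error yield a value of 1.0, and the estimate drops as the amount of error grows, which mirrors the statement that scores relatively free from measurement error are more stable and consistent.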

Types of Tests
We defined a test as a device or procedure in which a sample of an individual’s behavior is
obtained, evaluated, and scored using standardized procedures (AERA et al., 1999). You have
probably taken a large number of tests in your life, and it is likely that you have noticed that not
all tests are alike. For example, people take tests in schools that help determine their grades,
a test to obtain a driver’s license, interest inventories to help make educational and vocational
decisions, admissions tests when applying for college, exams to obtain professional certificates
and licenses, and personality tests to gain personal understanding. This brief list is clearly not
exhaustive!
Cronbach (1990) noted that tests generally can be classified as measures of either maximum
performance or typical response. Maximum performance tests also are referred to as ability
tests, but achievement tests are included here as well. On maximum performance tests items may
be scored as either “correct” or “incorrect” and examinees are encouraged to demonstrate their
very best performance. Maximum performance tests are designed to assess the
upper limits of the examinee’s knowledge and abilities. For example, maximum performance
tests can be designed to assess how well a student can perform selected tasks (e.g., 3-digit
multiplication) or has mastered a specified content domain (e.g., American history). Intelli-
gence tests and classroom achievement tests are common examples of maximum performance
tests. In contrast, typical response tests attempt to measure the typical behavior and charac-
teristics of examinees. Often, typical response tests are referred to as personality tests, and in
this context personality is used broadly to reflect a host of noncognitive characteristics such
as attitudes, behaviors, emotions, and interests (Anastasi & Urbina, 1997). Some individuals
reserve the term test for maximum performance measures, while using terms such as scale
or inventory when referring to typical response instruments. In this textbook we use the term
test in its broader sense, applying it to both maximum performance and typical response
procedures.

MAXIMUM PERFORMANCE TESTS. Maximum performance tests are designed to assess the
upper limits of the examinee’s knowledge and abilities. Within the broad category of maximum
performance tests, there are a number of subcategories. First, maximum performance tests are
often classified as either achievement tests or aptitude tests. Second, maximum performance tests
can be classified as either objective or subjective. Finally, maximum performance tests are often
described as either speed or power tests. These distinctions, although not absolute in nature,
have a long historical basis and provide some useful descriptive information.
Achievement and Aptitude Tests. Maximum performance tests are often classified as either
achievement tests or aptitude tests. Achievement tests are designed to assess the knowledge or
skills of an individual in a content domain in which he or she has received instruction. In
contrast, aptitude tests are broader in scope and are designed to measure the cognitive skills,
abilities, and knowledge that an individual has accumulated as the result of overall life
experiences (AERA et al., 1999). In other words, achievement tests are linked or tied to a
specific program of instructional objectives, whereas aptitude tests reflect the cumulative
impact of life experiences as a whole. This distinction, however, is not absolute and is
actually a matter of degree or emphasis. Most testing experts today conceptualize both
achievement and aptitude tests as measures of developed cognitive abilities that can be ordered
along a continuum in terms of how closely linked the assessed abilities are to specific
learning experiences.

Achievement tests measure knowledge and skills in an area in which instruction has been
provided (AERA et al., 1999).

Aptitude tests measure cognitive abilities and skills that are accumulated as the result of
overall life experiences (AERA et al., 1999).

Another distinction between achievement and aptitude tests involves the way their results
are used or interpreted. Achievement tests are typically used to measure what has been learned
or “achieved” at a specific point in time. In contrast, aptitude tests usually are used to predict
future performance or reflect an individual’s potential in terms of academic or job performance.
However, this distinction is not absolute either. As an example, a test given at the end of high
school to assess achievement might also be used to predict success in college. Although we feel
it is important to recognize that the distinction between achievement and aptitude tests is not
absolute, we also feel the achievement/aptitude distinction is useful when discussing different
types of abilities.
Objective and Subjective Tests. Objectivity typically implies impartiality or the absence of
personal bias. Cronbach (1990) stated that the less test scores are influenced by the subjec-
tive judgment of the person grading or scoring the test, the more objective the test is. In other
words, objectivity refers to the extent that trained examiners who score a test will be in agree-
ment and score responses in the same way. Tests with selected-response items (e.g., multiple-
choice, true–false, and matching) that can be scored using a fixed key and that minimize sub-
jectivity in scoring are often referred to as “objective” tests. In contrast, subjective tests are
those that rely on the personal judgment of the individual grading the test. For example, essay
tests are considered subjective because the person grading the test relies to some extent on his
or her own subjective judgment when scoring the essays. Most students are well aware that
different teachers might assign different grades to the same essay item. Essays and other test
formats that require the person grading the test to employ his or her own personal judgment
are often referred to as “subjective” tests. It is common, and desirable, for those developing
subjective tests to provide explicit scoring rubrics in an effort to reduce the impact of the
subjective judgment of the person scoring the test.
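
One simple way to express how closely two scorers agree is the proportion of responses to which they assign the same score. The sketch below is illustrative only: the function name and the example scores are hypothetical, and percent agreement is just one of several possible indices of scorer agreement.

```python
def percent_agreement(scores_a, scores_b):
    """Proportion of responses given the same score by two scorers."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("both scorers must score the same non-empty set of responses")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return matches / len(scores_a)


# Hypothetical multiple-choice items scored with a fixed key (1 = correct).
key_scorer_1 = [1, 0, 1, 1]
key_scorer_2 = [1, 0, 1, 1]

# Hypothetical essay grades assigned by two different teachers.
essay_teacher_1 = ["A", "B", "B", "C"]
essay_teacher_2 = ["A", "C", "B", "B"]

print(percent_agreement(key_scorer_1, key_scorer_2))        # 1.0
print(percent_agreement(essay_teacher_1, essay_teacher_2))  # 0.5
```

In this made-up example the key-scored items produce complete agreement, whereas the essay grades agree only half the time, which is the kind of discrepancy an explicit scoring rubric is intended to reduce.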

Speed and Power Tests. Maximum performance tests often are categorized as either speed tests
or power tests. On a pure speed test, performance only reflects differences in the speed of
performance. A speed test generally contains items that are relatively easy and has a strict
time limit that prevents any examinees from successfully completing all the items. Speed tests
are also commonly referred to as “speeded tests.” On a pure power test, the speed of
performance is not an issue. Everyone is given plenty of time to attempt all the items, but the
items are ordered according to difficulty, and the test contains some items that are so
difficult that no examinee is expected to answer them all. As a result, performance on a power
test primarily reflects the difficulty of the items the examinee is able to answer correctly.

On speed tests, performance reflects differences in the speed of performance.

On power tests, performance reflects the difficulty of the items the examinee is able to
answer correctly.

Well-developed speed and power tests are designed so no one will obtain a perfect score.
They are designed this way because perfect scores are “indeterminate.” That is, if someone ob-
tains a perfect score on a test, the test failed to assess the very upper limits of that person’s ability.
To assess the upper limits of ability adequately, tests need to have what test experts refer to as
an “adequate ceiling.” That is, the difficulty level of the tests is set so none of the examinees will
be able to obtain a perfect score.
As you might expect, this distinction between speed and power tests is also one of degree
rather than being absolute. Most often a test is not a pure speed test or a pure power test, but
incorporates some combination of the two approaches. For example, the Scholastic Assessment
Test (SAT) and Graduate Record Examination (GRE) are considered power tests, but both have
time limits. When time limits are set such that 95% or more of examinees will have the oppor-
tunity to respond to all items, the test is still considered to be a power test and not a speed test.
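
The 95% guideline just described can be checked with a few lines of Python. This is a rough sketch under stated assumptions: the function names are hypothetical, and each examinee's record is assumed to be the number of items he or she had time to reach before the time limit expired. The guideline says only when a timed test is still considered a power test, so the sketch reports that judgment and nothing more.

```python
def completion_rate(items_reached, n_items):
    """Proportion of examinees who had time to respond to every item."""
    if not items_reached:
        raise ValueError("at least one examinee is required")
    finished = sum(1 for reached in items_reached if reached >= n_items)
    return finished / len(items_reached)


def is_power_test(items_reached, n_items, threshold=0.95):
    """True if at least `threshold` of examinees reached all items."""
    return completion_rate(items_reached, n_items) >= threshold


# Hypothetical 40-item test taken by 20 examinees.
mostly_finished = [40] * 19 + [35]      # 19 of 20 reached every item
many_unfinished = [40] * 12 + [30] * 8  # only 12 of 20 reached every item

print(is_power_test(mostly_finished, 40))  # True
print(is_power_test(many_unfinished, 40))  # False
```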
TYPICAL RESPONSE TESTS. As noted, typical response tests are designed to measure the typical
behavior and characteristics of examinees. Typical response tests measure constructs such as per-
sonality, behavior, attitudes, or interests. In traditional assessment terminology, personality is a
general term that broadly encompasses a wide range of emotional, interpersonal, motivational,
attitudinal, and other personal characteristics (Anastasi & Urbina, 1997). When describing
personality tests, most assessment experts distinguish between objective and projective
techniques. Although there are some differences, this distinction largely parallels the
separation of maximum performance tests into “objective” or “subjective” tests. These two
approaches are described next.
Objective personality tests use items that are not influenced by the subjective judgment of
the person scoring the test.

Objective Personality Tests. As with maximum performance tests, in the context of typical
response assessment objectivity also implies impartiality or the absence of personal bias.
Objective personality tests are those that use selected-response items (e.g., true–false) and
are scored in an objective manner. For
