5 Assessment of Intellectual Functioning
Life is not so much a conflict of intelligences as a combat of characters.
ALFRED BINET & THÉODORE SIMON (1908/1916, p. 256)

The study of intelligence and cognitive abilities dates back more than a century and is characterized by the best and the worst of science: scholarly debates and bitter rivalries, research breakthroughs and academic fraud, major assessment paradigm shifts, and the birth of a commercial industry that generates hundreds of millions of dollars in annual revenue. Still struggling with unresolved matters dating from its birth, the study of intelligence has seen as many fallow periods in its growth as it has seen steps forward. In this chapter, the history and evolution of intelligence theory and applied intelligence testing are described, along with a vision of intelligence as a field of study grounded in theory and psychological science and aimed at facilitating clinical and educational decision making related to classification and intervention. The essential requirements of a mature clinical science, according to Millon (1999; Millon & Davis, 1996), are (a) a coherent foundational theory, from which testable principles and propositions may be derived; (b) a variety of assessment instruments, operationalizing the theory and serving the needs of special populations; (c) an applied diagnostic taxonomy, derived from and consistent with the theory and its measures; and (d) a compendium of change-oriented intervention techniques, aimed at modifying specific behaviors in a manner consistent with the theory. The study of intelligence has yet to claim status as a mature clinical science, but some signs of progress are evident.

The story behind the first intelligence tests is familiar to many psychologists (Wolf, 1973). In the fall of 1904, the French Minister of Public Instruction appointed a commission to study problems with the education of mentally retarded children in Paris, in response to the failures of the children to benefit from universal education laws. Alfred Binet, as an educational activist and leader of La Société Libre pour l’Étude Psychologique de l’Enfant (Free Society for the Psychological Study of the Child), was named to the commission. La Société had originally been founded to give teachers and school administrators an opportunity to discuss problems of education and to collaborate in research. Binet’s appointment to the commission was hardly an accident because members of La Société had already served as principal advocates with the ministry on behalf of schoolchildren. The commission’s recommendations included what they called a medico-pedagogical examination for children who do not benefit from education, teaching, or discipline, before such children were removed from primary schools and, if educable, placed in special classes. The commission did not offer any substance for the examination, but having thought about intelligence for over a decade, Binet decided to take advantage of the commission mandate and undertake the task of developing a reliable diagnostic system with his colleague Théodore Simon. The first Binet-Simon Scale was completed in 1905, revised in 1908, and revised again in 1911. By the completion of the 1911 edition, Binet and Simon’s scale was extended through adulthood and balanced with five items at each age level.
Although many scholars in the United States were introduced to the new intelligence test through Binet’s journal L’Année Psychologique, it became widely known after Henry H. Goddard, Director of Research at the Training School for the Retarded at Vineland, New Jersey, arranged for his assistant Elizabeth Kite to translate the 1908 scale. Impressed by its effectiveness in yielding scores in accord with the judgments of senior clinicians, Goddard became a vocal advocate of the test, distributing 22,000 copies and 88,000 response sheets by 1915. Within a few years, the test had changed the landscape for mental testing in the United States, spawning an entire testing industry and laying the groundwork for the proliferation of intelligence and achievement tests after World War I. The most successful adaptation of the Binet-Simon Scale was the Stanford-Binet Intelligence Scale, which dominated intelligence testing in the United States until the 1960s, when it was overtaken in popularity by the Wechsler intelligence scales (Lubin, Wallis, & Paine, 1971). The Wechsler scales have remained firmly entrenched as the most widely used intelligence tests in every subsequent psychological test usage survey.

DESCRIPTIONS OF THE MAJOR INTELLIGENCE TESTS

In this section, six of the leading individually administered intelligence tests are described, along with the most common ways to interpret them. The descriptions are limited to intelligence tests that purport to be reasonably comprehensive and multidimensional, covering a variety of content areas; more specialized tests (such as nonverbal cognitive batteries) and group-administered tests (usually administered in large-scale educational testing programs) have been excluded. Students of intellectual assessment will notice considerable overlap and redundancy between many of these instruments, in large part because they tend to measure similar psychological constructs with similar procedures, and in many cases, they have similar origins. With a few exceptions, most intelligence testing procedures can be traced to tasks developed from the 1880s through the 1920s.

The tests are presented in alphabetical order. For each test, its history is briefly recounted, followed by a description of its theoretical underpinnings. Basic psychometric features, including characteristics of standardization, reliability, and validity, are presented. Test administration is described but not detailed because administration can only be learned through a careful reading of the test manuals, and every test seems to offer its own set of unique instructions. Core interpretive indexes are also described in a way that is generally commensurate with descriptions provided in the test manuals, albeit with some modifications made for the purposes of clarity and precision. Emphasis is placed on the interpretive indexes that are central to the test but not on the plethora of indexes that are available for some tests.

Cognitive Assessment System

The Das-Naglieri Cognitive Assessment System (CAS; Naglieri & Das, 1997a) is a cognitive processing battery intended for use with children and adolescents 5 through 17 years of age. The origins of the CAS may be traced to the work of A. R. Luria, the preeminent Russian neuropsychologist whose work has been highly influential in American psychology (Solso & Hoffman, 1991). Beginning in 1972, Canadian scholar J. P. Das initiated a program of research based upon the simultaneous and successive modes of information processing suggested by Luria. Ashman and Das (1980) first reported the addition of planning measures to the simultaneous-successive experimental tasks, and separate attention and planning tasks were developed by the end of the decade (Naglieri & Das, 1987, 1988). The work of Luria and Das influenced Alan and Nadeen Kaufman, who published the Kaufman Assessment Battery for Children (K-ABC; based on the sequential-simultaneous dichotomy discussed later in this chapter) in 1983. Jack A. Naglieri, a former student of Kaufman’s who had assisted with the K-ABC development, met J. P. Das in 1984 and began a collaboration to assess Luria’s three functional systems. Thirteen years and more than 100 studies later, the CAS was published. It is available in two batteries: an 8-subtest basic battery and a 12-subtest standard battery.

Theoretical Underpinnings

The CAS has its theoretical underpinnings in Luria’s (1973, 1980) account of three functional units in the brain: (a) the first unit regulates cortical tone and focus of attention; (b) the second unit receives, processes, and retains information in two basic modes (simultaneous and successive); and (c) the third unit involves the formation, execution, and monitoring of behavioral planning. These processes are articulated and described in PASS theory, using the acronym for planning, attention, simultaneous, and successive processing (Das, Naglieri, & Kirby, 1994).

Of the theories and models associated with the major intelligence instruments, PASS theory and Kaufman’s sequential-simultaneous theory alone offer an approach with articulated neurobiological underpinnings (although g theory has numerous neurophysiological correlates). Moreover, Luria’s approaches to restoration of function after brain injury
(Luria, 1963; Luria & Tsvetkova, 1990) have provided a basis for use of his theory to understand and implement treatment and intervention. Accordingly, PASS theory and sequential-simultaneous theory have emphasized intervention more than do other intelligence tests.

Standardization Features and Psychometric Adequacy

The CAS was standardized from 1993 through 1996 on 2,200 children and adolescents from 5 through 17 years of age, stratified on 1990 census figures. Sample stratification variables included race, ethnicity, geographic region, community setting, parent educational attainment, classroom placement, and educational classification. The standardization sample was evenly divided between males and females, with n = 300 for the earliest school-age levels and n = 200 for levels with older children and adolescents. Demographic characteristics of the standardization sample are reported in detail across stratification variables in the CAS interpretive handbook and closely match the targeted census figures (Naglieri & Das, 1997b).

The reliability of the CAS is fully adequate. Internal consistency is computed through the split-half method with Spearman-Brown correction, and average reliabilities for the PASS and full-scale composite scores range from .84 (Attention, Basic Battery) to .96 (full-scale, Standard Battery). Average subtest reliability coefficients range from .75 to .89, with a median reliability of .82. Score stability coefficients were measured with a test-retest interval from 9 to 73 days, with a median of 21 days. Corrected for variability of scores from the first testing, the stability coefficients have median values of .73 for the CAS subtests and .82 for the Basic and Standard Battery PASS scales.

CAS floors and ceilings tend to be good. Test score floors extend two or more standard deviations below the normative mean, beginning with 6-year-old children; thus, discrimination at the lowest processing levels is somewhat limited with 5-year-olds. Test score ceilings extend more than two standard deviations above the mean at all age levels.

CAS full-scale standard scores correlate strongly with the Wechsler Intelligence Scale for Children–Third Edition Full Scale IQ (WISC-III FSIQ; r = .69) and the Woodcock-Johnson III Tests of Cognitive Abilities Brief Intellectual Ability (WJ III Cog; r = .70; from McGrew & Woodcock, 2001), and somewhat more moderately with the Wechsler Preschool and Primary Scale of Intelligence Full Scale IQ (WPPSI FSIQ; r = .60). Based upon a large sample (n = 1,600) used as a basis for generating ability-achievement comparisons, the CAS full-scale standard scores yield high correlations with broad reading and broad mathematics achievement (r = .70–.72).

Exploratory and confirmatory factor analyses of the CAS provide support for either a three- or four-factor solution (Naglieri & Das, 1997b). The four-factor solution is based upon the four PASS dimensions, whereas the three-factor solution combines Planning and Attention to form a single dimension. The decision to use the four-factor solution was based upon the test’s underlying theory, meaningful discrepancies between planning and attention performance in criterion populations (e.g., individuals with ADHD, traumatic brain injury), and differential response to treatments in intervention studies (e.g., planning-based intervention). On a data set based on the tryout version of the CAS, Carroll (1995) argued that the planning scale, which is timed, may best be conceptualized as a measure of perceptual speed. Keith, Kranzler, and Flanagan have challenged the CAS factor structure based upon reanalyses of the standardization sample and analysis with a new sample of n = 155 (Keith & Kranzler, 1999; Keith, Kranzler, & Flanagan, 2001; Kranzler & Keith, 1999; Kranzler, Keith, & Flanagan, 2000). These investigations have variously reported that the PASS model provides a better fit (but a less-than-optimal fit) to the standardization data, but not to the newer sample, than do several competing nonhierarchical models and that planning and attention factors demonstrate inadequate specificity for separate interpretation.

The CAS has also been studied with several special populations, including children and adolescents with ADHD, reading disabilities, mental retardation, traumatic brain injury, serious emotional disturbance, and intellectual giftedness. CAS is unique among tests of cognitive abilities and processes insofar as it has been studied with several research-based programs of intervention, one of which is described at the end of this chapter.

Interpretive Indexes and Applications

The CAS yields four standard scores corresponding to the PASS processes, as well as a full-scale standard score. Although the subtests account for high levels of specific variance, the focus of CAS interpretation is at the PASS scale level, not at the subtest level or full-scale composite level. PASS theory guides the examination of absolute and relative cognitive strengths and weaknesses. Table 18.1 contains interpretations for each of the PASS scales.

In general, children with diverse exceptionalities tend to show characteristic impairment on selected processes or combinations of processes. Children with a reading disability tend as a group to obtain their lowest scores on measures of successive processing (Naglieri & Das, 1997b), presumably due to the slowed phonological temporal processing thresholds
that have been identified as a processing deficit associated with delayed reading acquisition (e.g., Anderson, Brown, & Tallal, 1993). Children diagnosed with the hyperactive-impulsive subtype of ADHD characteristically show weaknesses on the planning and attention scales (Paolitto, 1999), consistent with the newest theories reconceptualizing ADHD as a disorder of executive functions (Barkley, 1997). Characteristic weaknesses in planning and attention have also been reported in samples of children with traumatic brain injury (Gutentag, Naglieri, & Yeates, 1998), consistent with the frontal-temporal cortical impairment usually associated with closed head injury.

TABLE 18.1 Interpretive Indexes From the Das-Naglieri Cognitive Assessment System (CAS; Naglieri & Das, 1997)

Composite Index: Description
Full Scale: Complex mental activity involving the interaction of diverse cognitive processes.
Planning: The process by which an individual determines, selects, applies, and evaluates solutions to problems; involves generation of strategies, execution of plans, self-control, and self-monitoring.
Attention: The process of selectively focusing on particular stimuli while inhibiting response to competing stimuli; involves directed concentration and sustained focus on important information.
Simultaneous Processing: The process of integrating separate stimuli into a single perceptual or conceptual whole; applies to comprehension of verbal relationships and concepts, understanding of inflection, and working with spatial information.
Successive Processing: The process of integrating stimuli into a specific, temporal order that forms a chainlike progression; involves sequential perception and organization of visual and auditory events and execution of motor behaviors in order.

Like most of the other intelligence tests for children and adolescents, CAS is also empirically linked to an achievement test (Woodcock-Johnson–Revised and the Woodcock-Johnson III Tests of Achievement). Through the use of simple and predicted differences between ability and achievement, children who qualify for special education services under various state guidelines for specific learning disabilities may be identified. Moreover, CAS permits the identification of impaired cognitive processes that may contribute to the learning problems. In contrast, CAS has very low acquired knowledge requirements.

CAS also provides normative reference for the use of metacognitive problem-solving strategies that may be observed by the examiner or reported by the examinee on planning subtests. The inclusion of age-referenced norms for strategy usage provides an independent source of information about the efficiency, implementation, and maturity with which an individual approaches and performs complex tasks. Children with ADHD, for example, tend to utilize developmentally younger strategies during task performance (Wasserman, Paolitto, & Becker, 1999).

Through an emphasis on cognitive processes rather than culture-anchored forms of acquired knowledge, CAS also offers an intellectual assessment approach that may reduce the disproportionately high number of minority children placed in special education settings. Wasserman and Becker (2000) reported a mean 3.5 full-scale standard score difference between demographically matched African Americans and Whites in the CAS standardization sample, compared to an 11.0 difference previously reported using similar matching strategies with the WISC-III FSIQ (Prifitera, Weiss, & Saklofske, 1998). The reduced race-based group mean score differences for CAS relative to WISC-III have been found to ameliorate the problem of disproportionate classification of African American minorities in special education programs for children with mental retardation (Naglieri & Rojahn, 2001). Accordingly, CAS offers promise in improving the equity of intellectual assessments.

By virtue of its theoretical underpinnings and linkages to diagnosis and treatment (discussed later in this chapter), the CAS builds upon the earlier advances offered by the K-ABC (Kaufman & Kaufman, 1983a, 1983b). In a recent review, Meikamp (1999) observed, “The CAS is an innovative instrument and its development meets high standards of technical adequacy. Despite interpretation cautions with exceptional populations, this instrument creatively bridges the gap between theory and applied psychology” (p. 77).

Differential Ability Scales

The Differential Ability Scales (DAS; C. D. Elliott, 1990a, 1990b) offer ability profiling in 17 subtests divided into two overlapping age levels and standardized for ages 2.5 through 17 years. The battery also includes several tests of school achievement that are beyond the scope of this chapter. The DAS is a U.S. adaptation, revision, and extension of the British Ability Scales (BAS; C. D. Elliott, 1983). Development of the BAS originally began in 1965, with a grant from the British Department of Education and Science to the British Psychological Society to prepare an intelligence scale. Under the direction of F. W. Warburton, more than 1,000 children were tested with a series of tasks developed to measure Thurstone’s (1938) seven primary mental abilities and key dimensions from Piagetian theory. Following Warburton’s death, the
government grant was extended, and in 1973 Colin Elliott became the director of the project. The decision was made to de-emphasize IQ estimation and to provide a profile of meaningful and distinct subtest scores, resulting in the name British Ability Scales. New subtests were created, and the use of item response theory was introduced in psychometric analyses. Following a standardization of 3,435 children, the first edition of the BAS was published in 1979, and an amended revised edition was published in 1982. The development of the DAS began in 1984, in an effort to address the strengths and weaknesses of the BAS and apply the test for use in the United States. To enhance clarity and diagnostic utility, six BAS subtests were deleted and four new subtests were added to create the DAS, which was published 25 years after the work on the BAS began. The DAS Cognitive Battery includes a preschool level beginning at age 2.5 and a school-age level beginning at age 6 years.

Theoretical Underpinnings

The DAS was developed to accommodate diverse theoretical perspectives and to permit interpretation at multiple levels of performance. It fits most closely with the work of the hierarchical multifactor theorists through its emphasis on a higher order general intellectual factor (conventionally termed g) and lower order broad cognitive factors. The DAS avoids use of the terms intelligence and IQ to avoid traditional misconceptions, focusing instead on cognitive abilities and processes that are either strongly related to the general factor or thought to have value for diagnostic purposes. The DAS is also characterized by an exceptionally high attention to technical qualities, and C. D. Elliott (1990b) was careful to ensure that all interpretive indexes, from the cluster scores down to the diagnostic subtests, have adequate reliable specificity to support individual interpretation.

The General Conceptual Ability (GCA) score captures test performance on subtests that have high g loadings, in contrast to tests such as the Wechsler scales in which all subtests (high and low g loading) contribute to the overall index of composite intelligence. At the hierarchical level below a superordinate general factor are cluster scores that have sufficient specific variance for interpretation. For children from ages 2.5 years through 3.5 years, only a single general factor may be derived from the DAS. For older preschool children, the clusters are verbal ability and nonverbal ability, roughly paralleling the verbal-performance dichotomy featured in the Wechsler intelligence scales. For school-age children, the clusters are verbal ability, nonverbal reasoning ability, and spatial ability. The increased cognitive differentiation from one general factor to two preschool clusters to three school-age clusters is consistent with the developmental tenet that cognitive abilities tend to become differentiated with maturation (Werner, 1948).

At the diagnostic level are four preschool subtests and three school-age subtests, included on the basis of cognitive and neuropsychological bodies of research. The preschool procedures include measures of short-term memory in separate auditory, visual, and crossed modalities, as well as measures tapping visual-spatial abilities. The school-age procedures include measures of short-term auditory memory, short-term cross-modality memory, and processing speed. Each of the subtests has adequate specific variance to be interpreted as an isolated strength or weakness.

Standardization Features and Psychometric Adequacy

The DAS was standardized from 1986 to 1989 on 3,475 children and adolescents, with 175–200 examinees per age level. The sample was balanced by age and sex, representative of 1988 U.S. census proportions, and stratified on race-ethnicity, parent educational level, geographic region, and educational preschool and special education enrollment. The sample excluded children with severe handicaps or limited English proficiency. The sample was largest for the preschool periods (n = 175 per 6-month interval), when cognitive development is most rapid. The composition of the normative sample is detailed across stratification variables in the DAS Introductory and Technical Handbook (C. D. Elliott, 1990b) and appears to closely match its target figures.

The reliabilities of the DAS subtests and composites were computed through innovative methodologies utilizing item response theory (IRT). Specifically, DAS subtests are administered in predetermined item sets rather than with formal basal and discontinue rules; this means that starting points and stopping decision points (as well as alternative stopping points) are designated on the record form according to the child’s age. If the child does not pass at least three items in the item set, the examiner administers an easier set of items. Accordingly, children receive a form of tailored, adaptive testing, in which they are given items closest to their actual ability levels. Because IRT permits measurement precision to be computed at each level of ability (see the chapter by Wasserman & Bracken in this volume), it was possible for C. D. Elliott (1990b) to provide reliability estimates that are similar to conventional indexes of internal consistency. IRT-based mean subtest reliabilities ranged from .71 to .88 for the preschool battery and from .70 to .92 for the school-age battery. Cluster and GCA reliabilities ranged from .81 to .94 for the preschool battery and .88 to .95 for the school-age battery. These score reliabilities tend to be fully adequate. The inclusion of psychometric statistics on out-of-level
subtests (i.e., items that are not normally administered to persons of a given age but that may be appropriate for individuals functioning at an ability level much lower than that typically expected for their age) provides examiners with additional flexibility, especially for older children with significant delay or impairment. Stability coefficients were computed for examinees in three age groups undergoing test and retest intervals of 2–7 weeks, with correction for restriction of range on the initial score. Across these groups, subtest test-retest reliabilities ranged from .38 to .89, whereas cluster and GCA reliabilities ranged from .79 to .94. These results indicate that composite scores tend to be adequately stable, but subtests at specific ages may not be particularly stable. Four subtests with open-ended scorable responses all have interrater reliabilities greater than or equal to .90, which falls within an acceptable range.

DAS floors are sufficiently low that the test can be used with 3-year-old children with developmental delays. Use of IRT scaling extrapolation also permits GCA norms to be extended downward to a standard score of 25, enhancing the discriminability of the DAS with individuals with moderate to severe mental retardation. Test items are also considered by reviewers to be appealing, engaging, and conducive to maintaining high interest in younger children (e.g., Aylward, 1992). Initial investigations with the DAS also suggest that it has promise in accurate identification and discrimination of at-risk preschoolers (McIntosh, 1999), sometimes a challenging group to assess because of test floor limitations.

The DAS tends to show strong convergence with other intelligence tests. According to analyses from C. D. Elliott (1990b), the DAS GCA correlates highly with composite indexes from the WPPSI-R (r = .89 for the preschool battery), WISC-III (r = .92 for the school-age battery; from Wechsler, 1991), Stanford-Binet (.77 preschool; .88 school-age), and K-ABC (.68 preschool; .75 school-age).

Exploratory and confirmatory factor analyses of the DAS provide general support for the structure of the test (C. D. Elliott, 1990b). In separate confirmatory reanalyses, Keith (1990) reported a structure that is generally consistent with the structure reported in the DAS handbook. He found support for a hierarchical structure with superordinate g and several second-order factors (including the diagnostic subtests) that generally correspond to the test’s structure. The nonverbal reasoning ability cluster had the strongest relationship to g for school-age children, whereas the early number concepts subtest had the strongest relationship to g for preschool children. These analyses may be interpreted as consistent with other bodies of research (e.g., Carroll, 1993; Gustafsson, 1984, 1988; Undheim, 1981) suggesting that reasoning ability is largely synonymous with g. In additional investigations, the DAS has also been found to be factorially stable across racial and ethnic groups (Keith, Quirk, Schartzer, & Elliott, 1999).

Interpretive Indexes and Applications

The DAS involves some score transformation based upon item response theory. Raw scores are converted first to latent trait ability scores, which are in turn translated into T scores and percentiles. T scores may be summed to produce the GCA and cluster scores (M = 100, SD = 15). The GCA is a composite score derived only from subtests with high g loadings, and cluster scores consist of subtests that tend to factor together. The diagnostic subtests measure relatively independent abilities. The clusters and diagnostic subtests have adequate specific variance to support their interpretation independent from g. Table 18.2 contains the basic composite indexes, with subtests excluded.

TABLE 18.2 Differential Ability Scales Cognitive Battery Composite Indexes (S. N. Elliott, 1990)

Composite Index: Description
General Conceptual Ability (GCA): Ability to perform complex mental processing that involves conceptualization and transformation of information.

Ages 3 years, 6 months through 5 years
Verbal Ability: Acquired verbal concepts and knowledge.
Nonverbal Ability: Complex nonverbal mental processing, including spatial perception, nonverbal reasoning ability, perceptual-motor skills, and the understanding of simple verbal instructions and visual cues.

Ages 6 years through 17 years
Verbal Ability: Complex verbal mental processing, including acquired verbal concepts, verbal knowledge, and reasoning; involves knowledge of words, verbal concepts, and general information; also involves expressive language ability and long-term semantic memory.
Nonverbal Reasoning Ability: Nonverbal inductive reasoning and complex mental processing; inductive reasoning, including an ability to identify the rules that govern features or variables in abstract visual problems and an ability to formulate and test hypotheses; understanding of simple verbal instructions and visual cues; and use of verbal mediation strategies.
Spatial Ability: Complex visual-spatial processing; ability in spatial imagery and visualization, perception of spatial orientation (the preservation of relative position, size, and angles in different aspects of the design), analytic thinking (the separation of the whole into its component parts), and attention to visual detail.

Note. Interpretive indexes are adapted from Sattler (1988).
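
The score metrics named in the DAS discussion can be related to one another through z scores. The following is a minimal Python sketch, assuming the conventional T-score metric (M = 50, SD = 10), the composite metric given in the text (M = 100, SD = 15), and normally distributed scores. It illustrates the metrics only: the actual DAS conversions from raw scores to ability scores, and from summed T scores to GCA and cluster scores, depend on the published norm tables, which are not reproduced here. All function names are hypothetical.

```python
from statistics import NormalDist

# Metric parameters. The T-score metric is the conventional one (an assumption
# here); the composite metric (M = 100, SD = 15) is the one stated in the text.
T_MEAN, T_SD = 50.0, 10.0
SS_MEAN, SS_SD = 100.0, 15.0


def t_to_z(t_score: float) -> float:
    """Express a T score as a z score (distance from the mean in SD units)."""
    return (t_score - T_MEAN) / T_SD


def standard_to_z(standard_score: float) -> float:
    """Express a composite standard score as a z score."""
    return (standard_score - SS_MEAN) / SS_SD


def z_to_standard(z: float) -> float:
    """Place a z score on the composite standard-score metric."""
    return SS_MEAN + SS_SD * z


def percentile_rank(z: float) -> float:
    """Percentile rank of a z score, assuming a normal distribution."""
    return 100.0 * NormalDist().cdf(z)


if __name__ == "__main__":
    z = t_to_z(60.0)
    print(z, z_to_standard(z), round(percentile_rank(z), 1))  # 1.0 115.0 84.1
    print(standard_to_z(25.0))                                # -5.0
```

Under these assumptions, a T score of 60 lies one standard deviation above the mean, which corresponds to 115 on the composite metric and roughly the 84th percentile. The extended GCA floor of 25 mentioned earlier lies five standard deviations below the mean.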
Test reviews in the Mental Measurements Yearbook have lauded the technical psychometric quality of the DAS as well as its utility with preschool children; they are critical, however, of selected aspects of administration and scoring. Aylward (1992) noted its utility with delayed or impaired young children: “The combination of developmental and educational perspectives makes the DAS unique and particularly useful in the evaluation of young (3.5–6 years) children suspected of having developmental delays, children with hearing or language difficulties, or school-age students with LDs [learning disabilities] or mild mental retardation” (p. 282). At the same time, Reinehr (1992) expressed concern about its administration and scoring complexity. For example, the Recall of Digits subtest involves presentation of digits at a rate different from the conventional rate in psychological testing, and raw scores must be transformed to IRT-based ability scores before undergoing transformation to norm-referenced standard scores. Practice administration and newly available computer-scoring software may help to address some of these concerns.

Kaufman Assessment Battery for Children

Alan S. Kaufman and Nadeen L. Kaufman are married coauthors of the Kaufman Assessment Battery for Children (K-ABC; Kaufman & Kaufman, 1983a, 1983b) and Kaufman Adolescent and Adult Intelligence Test (KAIT; Kaufman & Kaufman, 1993). They have a unique training and academic lineage and have in turn exerted strong influences on several leading test developers. Their history here is summarized from the Kaufmans’ own telling, as provided to Cohen and Swerdlik (1999). Alan Kaufman completed his doctorate from Columbia University under Robert L. Thorndike, who would head the restandardization of the Stanford-Binet L-M (Terman & Merrill, 1973) and serve as senior author of the Stanford-Binet Fourth Edition (R. L. Thorndike, Hagen, & Sattler, 1986). Kaufman was employed at the Psychological Corporation from 1968 to 1974, where he worked closely with David Wechsler on the WISC-R. Nadeen Kaufman completed her doctorate in special education with an emphasis in neurosciences from Columbia University, where she acquired a humanistic, intra-individual developmental approach to psychological assessment and learning disabilities that would blend uniquely with her husband’s approach. Following his departure from the Psychological Corporation, Alan Kaufman joined the Educational and School Psychology Department at the University of Georgia. According to the Kaufmans, the K-ABC was conceptualized and a blueprint developed on a 2-hour car trip with their children in March of 1978. In a remarkable coincidence, they were contacted the next day by the director of test development at American Guidance Service (AGS), who asked if they were interested in developing an intelligence test to challenge the Wechsler scales. At the University of Georgia, Alan and Nadeen worked with a gifted group of graduate students on the K-ABC. Among their students were Bruce Bracken, Jack Cummings, Patti Harrison, Randy Kamphaus, Jack Naglieri, and Cecil Reynolds, all influential school psychologists and test authors.

Theoretical Underpinnings

The K-ABC was developed to assess Luria’s (1980) neuropsychological model of sequential and simultaneous cognitive processing. As conceptualized by the Kaufmans, sequential operations emphasize the processing of stimuli events in sequential or serial order, based upon their temporal relationship to preceding and successive stimuli. Language, for instance, is inherently sequential because one word is presented after another in everyday communications. Simultaneous operations refer to the processing, integration, and interrelationship of multiple stimuli events at the same time. Spatial perception lends itself to simultaneous processing, for example, because it requires that various figural elements be organized into a single perceptual whole. The sequential-simultaneous dichotomy represents two distinctive forms of novel information processing. Factual knowledge and acquired skills are measured in an Achievement scale that is separate from the two mental processing scales.

In what the Kaufmans have described as a theoretical rerouting, the KAIT was based primarily on the Cattell and Horn distinction between fluid and crystallized intelligence and was developed to serve ages 11 to 85+ years. The K-ABC and KAIT models may be reconciled if mental processing is considered roughly equivalent to fluid intelligence (reasoning) and achievement is treated as analogous to crystallized intelligence (knowledge; Cohen & Swerdlik, 1999). Fluid intelligence refers to forms of analysis that only minimally rely upon recall of knowledge and well-learned skills to draw conclusions, reach solutions, and solve problems. Reasoning is considered to be fluid when it takes different forms or utilizes different cognitive skills according to the demands of the situation. Cattell and Horn (1976) describe crystallized intelligence as representing a coalescence or organization of prior knowledge and educational experience into functional cognitive systems to aid further learning in future educational situations. Crystallized intelligence is dependent upon previously learned knowledge and skills, as well as on forms of knowledge that are culturally and linguistically based. The Stanford-Binet and Woodcock-Johnson III tests of cognitive abilities represent the
most frequently used measures based upon a fluid and crystallized model of intelligence. As the most well researched and popular of the Kaufman cognitive-intellectual measures, the K-ABC is now described.

Standardization Features and Psychometric Adequacy

The K-ABC underwent national standardization in 1981, and norms are based on a sample of 2,000 children between the ages of 2.5 and 12.5 years. The sample was collected to be representative according to 1980 U.S. census figures, based upon the stratification variables of sex, race-ethnicity, geographic region, community size, parental education, and educational placement. Ages were sampled at n = 200 per 12-month interval. Exceptional children were included in the K-ABC sample. The sample tends to be fairly representative of census expectations at the time of standardization, although African American and Hispanic minorities tended to have higher parent education levels than expected according to census figures (Bracken, 1985). The use of minorities of high socioeconomic status (SES) may explain the small African American-White and Hispanic-White group mean score differences reported for the K-ABC (Kaufman & Kaufman, 1983b).

The reliabilities of the K-ABC were computed with a Rasch adaptation of the split-half method. Person-ability estimates were computed from each of the odd and even item sets and correlated, with correction for length by the Spearman-Brown formula. For preschool children, the mean subtest reliability coefficients range from .72 to .89 (Mdn = .78); for school-age children, they range from .71 to .85 (Mdn = .81). These score reliabilities approach the lower bounds of acceptability. The K-ABC mean composite scale reliabilities range from .86 to .94, with a mean Mental Processing Composite coefficient of .91 for preschool children and .94 for school-age children. Test-retest stability over an interval of 2–4 weeks (M = 18 days) yielded a median Mental Processing Composite reliability of .88, median processing scale reliabilities of .85, and median subtest reliabilities of .76. Stability coefficients tend to be smaller for preschool than for school-aged children.

The K-ABC offers several unique developmental features, coupled with floor and ceiling limitations. The test consists of developmentally appropriate subtests at specific ages, and several subtests are introduced at age 5. By contrast, the Wechsler scales and the WJ III have similar subtest procedures across the entire life span. The K-ABC also permits out-of-level testing, so that tests intended for 4-year-olds may be given to older children with mental retardation or developmental delays, whereas tests intended for older children may be given to 4-year-olds who are thought to be gifted or developmentally advanced. At the same time, the K-ABC has some problems with floors because subtests do not consistently extend 2 SDs below the normative mean until age 6. Subtest ceilings do not consistently extend 2 SDs above the normative mean above age 10.

Interpretive Indexes and Applications

The K-ABC consists of 10 processing subtests, each with a normative mean of 10 and standard deviation of 3, intended for specific age ranges between 2.5 and 12.5 years. Six additional subtests are included to test academic achievement. The K-ABC yields four global processing scales: Mental Processing Composite, Sequential Processing, Simultaneous Processing, and Nonverbal, all with a mean of 100 and standard deviation of 15. Table 18.3 includes core interpretations for the K-ABC global processing scales. Kaufman and Kaufman (1983b) recommend a step-by-step approach to interpretation and hypothesis generation, beginning with interpretation of mental processing and achievement composites and proceeding through individual subtest strengths and weaknesses. Emphasis is placed upon subtest profile analysis, in which subtest performance is compared with an examinee’s own subtest mean in order to identify relative strengths and weaknesses. A number of profile patterns are described to explain achievement performance based upon the Lurian model.

TABLE 18.3 Kaufman Assessment Battery for Children

Composite Indexes | Description
Mental Processing Composite (MPC) | An aggregate index of information-processing proficiency; intended to emphasize problem-solving rather than acquired knowledge and skills.
Sequential Processing | An index of proficiency at processing stimuli in sequential or serial order, where each stimulus is linearly or temporally related to the previous one.
Simultaneous Processing | An index of proficiency at processing stimuli all at once, in an integrated manner interrelating each element into a perceptual whole.
Nonverbal | A broad index of cognitive processing based upon K-ABC subtests that are appropriate for use with children who are deaf, have communication disorders, or have limited English proficiency.

At the time of its publication, the K-ABC was perceived as innovative and progressive, holding considerable promise for changing fundamental aspects of intellectual assessment. To a limited extent, this promise has been realized because many K-ABC features (e.g., easel-based test administration) have become standard for intelligence testing. In a thoughtful review of the impact of the K-ABC, Kline, Snyder, and
Castellanos (1996) noted its laudable intentions to formulate a test based upon a coherent theory, using novel measurement paradigms to assess cognitive skills that are considered directly relevant to school achievement. The main problems with the K-ABC, according to Kline and colleagues (1996), include the degree to which its subtests may be interpreted as tapping constructs other than those intended (e.g., sequential processing subtests may be seen as measures of short-term memory, and simultaneous processing subtests as measures of spatial cognition) and its failure to adequately support its remedial model. It is noted, however, that the K-ABC and CAS are the only major intelligence tests to even make a formal attempt to link cognitive assessment to remediation.

Stanford-Binet Intelligence Scale

The oldest line of intelligence tests is the Stanford-Binet Intelligence Scale, now in its fourth edition (SB4; R. L. Thorndike et al., 1986), with a redesigned fifth edition undergoing standardization at the time of this writing. The Stanford-Binet has a distinguished lineage, having been the only one of several adaptations of Binet’s 1911 scales to survive to the present time. According to Théodore Simon (cited in Wolf, 1973, p. 35), Binet gave Lewis M. Terman at Stanford University the rights to publish an American revision of the Binet-Simon scale “for a token of one dollar.” Terman (1877–1956) may arguably be considered the person most responsible for spawning the large-scale testing industry that develops educational tests for many states. He was a leading author and advocate for Riverside Press and the World Book Company, as well as a founding vice president at the Psychological Corporation (Sokal, 1981). Terman’s (1916) adaptation of the Binet-Simon Scales was followed by his collaboration with Maud A. Merrill beginning in 1926 to produce two parallel forms (Forms L for Lewis and M for Maud) published in 1937. The 1937 edition of the Stanford-Binet had remarkable breadth, developmentally appropriate procedures, and a highly varied administration pace so that examinees performed many different kinds of activities. It may have constituted an early high point for intelligence testing. McNemar’s (1942) analyses of the standardization data included the creation of nonverbal scales and memory scales, none of which were implemented in the 1960 edition. Terman and Merrill merged the best items of each form into Form L-M in 1960. A normative update and restandardization of Form L-M, with only minor content changes, was conducted from 1971 to 1972 under the direction of Robert L. Thorndike. Thorndike’s norming approach was unusual because he sampled from the 20,000 student participants (and their siblings) who had participated in the norming of his group-administered Cognitive Abilities Test (CogAT); specifically, he sought to stratify the Stanford-Binet sample to proportionately represent all ability levels based upon performance on the Verbal CogAT Battery. No effort was made to stratify the sample on demographic variables such as race, ethnicity, or SES, although the sample ultimately was more inclusive and diverse than that used with any previous Stanford-Binet edition.

The Stanford-Binet–Fourth Edition (SB4) was published in 1986, authored by Robert L. Thorndike, Elizabeth P. Hagen, and Jerome M. Sattler. The SB4 covers the age range of 2 through 23 years and offers several significant departures from its predecessors, most notably offering a point-scale format instead of Form L-M’s age-scale format and offering factor-based composite scores, whereas Form L-M only yielded a composite intelligence score. The SB4 was the first major intelligence test to include use of IRT in building scales and differential item functioning to minimize item bias. Attempts were made to preserve many of the classic procedures (e.g., picture absurdities) that were prominent in prior editions. In spite of these efforts, the SB4 was poorly executed, receiving considerable criticism for problems with the makeup of its standardization sample, delays in producing test norms and a technical manual, and dissent among the authors that led to the development of several different factor-scoring procedures.

Theoretical Underpinnings

Binet’s tests are best remembered for innovative diversity and their emphasis upon a common factor of judgment, which may be considered similar to Spearman’s g factor. In understanding intelligence, Terman placed an emphasis upon abstract and conceptual thinking over other types of mental processes. In the famous 1921 symposium on intelligence, he asserted that the important intellectual differences among people are “in the capacity to form concepts to relate in diverse ways, and to grasp their significance. An individual is intelligent in proportion as he is able to carry on abstract thinking. . . . Many criticisms of the current methods of testing intelligence rest plainly on a psychology which fails to distinguish the levels of intellectual functioning or to assign to conceptual thinking the place that belongs to it in the hierarchy of intelligences” (pp. 128–129). Terman further asserted that measures of abstract thinking using language or other symbols are most strongly associated with educational success, arguing that the Stanford-Binet contained as many of these types of tasks as was practical.

The SB4 sought to recast many of the Stanford-Binet’s classical tests in terms of Cattell and Horn’s fluid-crystallized model of cognitive abilities, thereby “to wed theory with
measurement practice” (R. M. Thorndike & Lohman, 1990, p. 125). The SB4 was conceptualized to measure a hierarchically organized model of intelligence. General ability or g is at the apex of this model, and is interpreted “as consisting of the cognitive assembly and control processes that an individual uses to organize adaptive strategies for solving novel problems” (R. L. Thorndike et al., 1986, p. 3). Three broad group factors (crystallized abilities, fluid-analytic abilities, and short-term memory) constitute the second level. First-order factors in the four broad areas of cognitive ability tapped by the SB4 are represented at the base of the model with crystallized abilities represented by both verbal and quantitative reasoning tasks. Accordingly, the SB4 may be considered the first contemporary test to operationalize the fluid and crystallized model of intelligence.

Standardization Features and Psychometric Adequacy

The Stanford-Binet was standardized in 1985 on 5,013 children, adolescents, and adults in 17 age groups. Age groups were generally represented by 200 to 300 participants, although the numbers dip below 200 for older adolescents. The sample was selected to be representative of 1980 U.S. census figures. Stratification variables included sex, ethnicity, geographic region, and community size, with parent educational and occupational levels serving as proxies for SES. The final sample was weighted to adjust for inadequate representation of children from low-SES backgrounds, thereby introducing potential sampling error into the standardization sample. Stratification accuracy is reported in the Stanford-Binet on the margins, so that, for example, the percent of the sample classified in varied racial-ethnic groups is reported in isolation without concurrent information about socioeconomic composition. The unpredictable consequences of sample weighting on proportionate representation in specific sampling cells cannot be formally assessed when stratification by several variables is not fully reported. Accordingly, it is difficult to evaluate the representativeness of the Stanford-Binet sample.

The reliabilities of the Stanford-Binet scores are fully adequate. Computed with the Kuder-Richardson Formula 20, the composite standard age score internal consistency reliability ranges from .95 to .99 across age groups. Lower bound reliability estimates for the area scores (based on the two-subtest versions of these composites) are at or above .90 for the verbal reasoning area, the abstract-visual reasoning area, and the quantitative area, but they are at .89 for the short-term memory area. Median subtest reliabilities range from .73 to .94. Test-retest reliability over an interval ranging from 2 to 8 months (M = 16 weeks) appears to be generally adequate; the composite standard age score (SAS) is at .90, the area scores have a median stability coefficient of .81 (quantitative reasoning has markedly lower stability than do the other area scores), and the subtests have a median stability coefficient of .70. Given the longer-than-typical test-retest interval, these retests tend to be adequate with the possible exception of the quantitative reasoning subtests.

Stanford-Binet floors and ceilings also tend to be adequate. Test score floors extend two or more standard deviations below the normative mean beginning with age 4, indicating that younger children with cognitive delays may show floor effects. Subtest ceilings consistently extend two or more standard deviations above the normative mean up through age 10, so older children who are intellectually gifted may show ceiling effects. The overall composite SAS ranges from –4 SD to +4 SD across ages.

The factor structure of the Stanford-Binet has yielded several separate solutions (all of which are scored in the test software), marked by Sattler’s dissension from his coauthors and publication of new factor analyses and score computations (Sattler, 1988). Robert M. Thorndike, son of the senior author, sought to resolve the divergent solutions in a 1990 study. In brief, he concluded that from ages 2 through 6, a two-factor solution representing primarily verbal ability (defined by vocabulary, comprehension, absurdities, and memory for sentences) and nonverbal ability (defined by pattern analysis, copying, quantitative, and bead memory) was most defensible. From ages 7 through 11, a three-factor solution including verbal ability (defined by vocabulary, comprehension, and memory for sentences), abstract-visual ability (defined by pattern analysis, copying, matrices, bead memory, and absurdities), and memory (defined primarily by memory for digits and memory for objects, although memory for sentences has a secondary load here) was supported. From ages 12 through 23 years, three factors were also supported: verbal ability (vocabulary, comprehension, memory for sentences), abstract-visual ability (pattern analysis, matrices, paper folding and cutting, number series, equation building, and to a lesser extent bead memory), and memory (memory for digits, memory for sentences, and memory for objects). Thorndike was unable to extract a quantitative factor. These results are generally consistent with those offered by Sattler (1988) and suggest that the Stanford-Binet quantitative reasoning standard age scores should be interpreted with caution. The Stanford-Binet permits substantial flexibility in choosing the number and identity of subtests contributing to a composite, but the degree to which subtests are interchangeable (i.e., appropriately substituted for one another) is questionable and should be based upon the factor analytic findings described previously. In a comprehensive review of the factor analytic studies of the Stanford-Binet, Laurent, Swerdlik, and Ryburn
(1992) concluded that the analyses by Sattler and R. M. Thorndike were essentially correct.

The Stanford-Binet composite SAS generally correlates highly with the Wechsler scales (r = .80–.91 across the WPPSI, WISC, and WAIS) and with the K-ABC (r = .74–.89), according to a review by Kamphaus (1993). The Stanford-Binet, however, is the only one of the major intelligence tests not systematically linked to an achievement test for identification of ability-achievement discrepancies. According to the SB4 technical manual, the correlation between the composite SAS and the K-ABC Achievement Scale was .74.

Interpretive Indexes and Applications

The Stanford-Binet consists of 15 point-scale tests, in contrast with the developmental age scale utilized for Form L-M. The vocabulary test is used with chronological age to locate the starting point for each test. Each of the tests has a normative mean of 50 and SD of 8. Four broad areas of cognitive abilities are assessed: Verbal Reasoning, Abstract-Visual Reasoning, Quantitative Reasoning, and Short-Term Memory. SAS composites are all set at a mean of 100 and SD of 16. As discussed previously, the factor studies reported by Sattler (1988) and R. M. Thorndike (1990) provide more support for the interpretation of the factor scores than for the four broad area scores, so the use of the factors is recommended for interpretive purposes: verbal comprehension ability, nonverbal reasoning-visualization ability, and memory ability. The first two factors should be interpreted for children aged 2–6, but all three factors should be interpreted for ages 7–23. Fundamental interpretation of these composite and factor scores appears in Table 18.4. The overall IQ score is termed the Composite Standard Age Score in an attempt to avoid some of the connotations of the term IQ. All of the Stanford-Binet subtests are either good or fair measures of g (Sattler, 1988).

TABLE 18.4 Stanford-Binet Intelligence Scale–Fourth Edition (R. L. Thorndike, Hagen, & Sattler, 1986)

Composite/Factor Indexes | Description
Composite Standard Age Score | A global estimate of cognitive ability.
Ages 2 years through 6 years
Verbal Comprehension Ability | Depth and breadth of accumulated experience and repertoire of verbal knowledge.
Nonverbal Reasoning and Visualization Ability | Nonverbal, fluid problem-solving abilities, particularly when stimuli are presented visually and involve motor, pointing, or verbal responses.
Ages 7 years through 23 years
Verbal Comprehension Ability | Complex verbal mental processing, including acquired verbal concepts, verbal knowledge, and reasoning; involves knowledge of words, verbal concepts, and general information; also involves expressive language ability and long-term semantic memory.
Nonverbal Reasoning and Visualization Ability | Nonverbal, fluid problem-solving abilities, particularly when stimuli are presented visually and involve motor, pointing, or verbal responses.
Memory Ability | Short-term memory abilities; involves the abilities to attend, encode, use rehearsal strategies, shift mental operations rapidly, and self-monitor.

Although the SB4 does not appear to have the proficiency of its predecessors in identifying intellectually gifted children, it has been shown to have utility in facilitating the identification of mentally retarded and neurologically impaired children (Laurent et al., 1992). The Examiner’s Handbook and Inferred Abilities and Influences Chart (Delaney & Hopkins, 1987) provide additional guidelines for administration and interpretive depth; they also describe appropriate combinations of tests to use with special populations.

The SB4 blended “old tasks and new theory” (R. M. Thorndike & Lohman, 1990, p. 126) to create a much-needed revision to the older L-M edition. It offered factor scores, an easy easel-based administration format, a flexible and versatile administration format, and better psychometric properties than the L-M’s. It included numerous technical innovations (e.g., use of IRT and differential item functioning studies). The major weaknesses of the SB4 involved the boldness of its break with its own tradition and its poor technical execution, particularly in the representativeness of its normative sample and the problems with its disputed factor structure. Cronbach (1989) questioned the factor structure and asked, “How useful is the profile?” (p. 774). Anastasi (1989) noted that the Stanford-Binet’s “principal limitation centers on communications with test users, especially in clinical settings” (p. 772).

Wechsler Intelligence Scales

No brand name in psychology is better known than Wechsler, now applied to a series of four intelligence scales spanning the ages 3 through 89, an adult memory scale covering ages 16 through 89, and an achievement test covering ages 4 through adult. The remarkable success of the Wechsler measures is attributable to David Wechsler (1896–1981), a gifted clinician and psychometrician with a well-developed sense of what was practical and clinically relevant. Decades after Wechsler’s death, his tests continue to dominate intellectual assessment among psychologists (Camara, Nathan, & Puente, 2000). Indeed, even the achievement test bearing his name (but that he did not develop) has become a market leader.
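
The batteries described so far report scores on different metrics: K-ABC processing subtests use a mean of 10 and SD of 3, K-ABC global scales a mean of 100 and SD of 15, SB4 tests a mean of 50 and SD of 8, and SB4 composite SAS a mean of 100 and SD of 16. As a minimal sketch, the arithmetic relating these metrics through z scores might look like the following. The class and function names are hypothetical, chosen only for illustration, and the conversion is purely a change of scale: it assumes nothing about the equivalence of the norm samples, so it is not a substitute for empirically equated scores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreMetric:
    """A norm-referenced score scale defined by its mean and SD (hypothetical helper)."""
    name: str
    mean: float
    sd: float

    def to_z(self, score: float) -> float:
        # Distance from the normative mean in SD units.
        return (score - self.mean) / self.sd

    def from_z(self, z: float) -> float:
        # Score on this metric that lies z SDs from the mean.
        return self.mean + z * self.sd


# Means and SDs as stated in the text for each battery.
KABC_SUBTEST = ScoreMetric("K-ABC processing subtest", 10, 3)
KABC_GLOBAL = ScoreMetric("K-ABC global scale", 100, 15)
SB4_TEST = ScoreMetric("SB4 test", 50, 8)
SB4_SAS = ScoreMetric("SB4 composite SAS", 100, 16)


def convert(score: float, source: ScoreMetric, target: ScoreMetric) -> float:
    """Re-express a score from one metric on another by way of its z score."""
    return target.from_z(source.to_z(score))


if __name__ == "__main__":
    # An SB4 composite of 132 lies 2 SDs above the mean,
    # which corresponds to 130 on a mean-100, SD-15 metric.
    print(convert(132, SB4_SAS, KABC_GLOBAL))
    # The -4 SD to +4 SD range reported for the SB4 composite, in SAS units.
    print(SB4_SAS.from_z(-4), SB4_SAS.from_z(4))
```

The same z-score arithmetic underlies the floor and ceiling statements above: a subtest "extends 2 SDs below the normative mean" on the K-ABC subtest metric when its lowest obtainable scaled score is 4 or less.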
Wechsler’s role in the history of intelligence assessment has yet to be formally assessed by historians, but the origins of his tests and interpretive approaches can easily be traced to his early educational and professional experiences. Wechsler was introduced to most of the procedures that would eventually find a home in his intelligence and memory scales as a graduate student at Columbia University (with faculty including J. McKeen Cattell, Edward L. Thorndike, and Robert S. Woodworth), as an assistant for a brief time to Arthur Otis at the World Book Company in the development of the first group intelligence test, and as an army psychological examiner in World War I. As part of a student detachment from the military, Wechsler attended the University of London in 1919, where he spent some 3 months working with Charles E. Spearman. From 1925 to 1927, he would work for the Psychological Corporation in New York, conducting research and developing tests such as his first, entitled Tests for Taxicab Drivers. Finally, Wechsler sought clinical training from several of the leading clinicians of his day, including Augusta F. Bronner and William Healy at the Judge Baker Foundation in Boston and Anna Freud at the Vienna Psychoanalytic Institute (for 3 months in 1932). By virtue of his education and training, Wechsler should properly be remembered as one of the first scientist-clinicians in psychology.

Wechsler originally introduced the Bellevue Intelligence Tests in 1939 (Wechsler, 1939), followed by the Wechsler Intelligence Scale for Children (WISC; Wechsler, 1949), the Wechsler Adult Intelligence Scale (WAIS; Wechsler, 1955), and the Wechsler Preschool and Primary Scale of Intelligence (WPPSI; Wechsler, 1967). With some variation, these tests all use the same core set of subtests and interpretive scores. The most recent editions of Wechsler’s tests are the WISC-III (Third Edition; Wechsler, 1991), the WAIS-III (Wechsler, 1997), and a short form named the Wechsler Abbreviated Scale of Intelligence (WASI; Wechsler, 1999). The WASI uses the Wechsler Vocabulary, Similarities, Block Design, and Matrix Reasoning subtests.

Theoretical Underpinnings

The Wechsler intelligence scales are decidedly atheoretical (beyond their emphasis on g), and in recent years they have exemplified a test in search of a theory. As originally conceptualized by David Wechsler (1939), they were clearly intended to tap Spearman’s general intelligence factor, g: “The only thing we can ask of an intelligence scale is that it measures sufficient portions of intelligence to enable us to use it as a fairly reliable index of the individual’s global capacity” (p. 11). Wechsler purposefully included a diverse range of tasks to avoid placing disproportionate emphasis on any one ability: “My definition of intelligence is that it’s not equivalent to any single ability, it’s a global capacity. . . . The tests themselves are only modes of communication” (Wechsler, 1976, p. 55). Although he was at Columbia University when the Spearman-Thorndike-Thomson debates on g were occurring in the professional journals, he was sufficiently taken with Spearman’s work to later (unsuccessfully) attempt the identification of a parallel general emotional factor (Wechsler, 1925). Wechsler’s friendship with and loyalty to Spearman never permitted him to break with g theory, and in 1939 he wrote that Spearman’s theory and its proofs constitute “one of the great discoveries of psychology” (p. 6).

Wechsler did not believe that division of his intelligence scales into verbal and performance subtests tapped separate dimensions of intelligence; rather, he felt that this dichotomy was diagnostically useful (e.g., Wechsler, 1967). In essence, the verbal and performance scales constituted different ways to assess g. Late in his life, Wechsler described the verbal and performance tests merely as ways to converse with a person, that is, “to appraise a person in as many different modalities as possible” (Wechsler, 1976, p. 55). Wechsler’s scales sought to capitalize on preferences of practitioners to administer both verbal and performance scales by packaging both in a single conormed test battery (a combination previously attempted by Rudolf Pintner, who was responsible, with Donald G. Paterson, for one of the most popular early performance scales). Wechsler found belatedly that after they were published, his tests were valued more for their verbal-performance dichotomy than for their diverse measures of g:

    It was not until the publication of the Bellevue Scales that any consistent attempt was made to integrate performance and verbal tests into a single measure of intelligence test. The Bellevue tests have had increasingly wider use, but I regret that their popularity seems to derive, not from the fact that they make possible a single global rating, but because they enable the examiner to obtain separate verbal and performance I.Q.’s with one test. (Wechsler, 1950/1974, p. 42)

Wechsler was clearly aware of multifactor theories of human ability. He placed relatively little emphasis upon multifactor ability models in his tests, however, because after the contribution of the general factor of intelligence was removed, the group factors (e.g., verbal, spatial, memory) accounted for little variance in performance (e.g., Wechsler, 1961). Wechsler also rejected the separation of abilities because he saw intelligence as resulting from the collective integration and connectivity of separate neural functions. He believed that intelligence would never be localized in the brain and observed, “While intellectual abilities can be shown
to contain several independent factors, intelligence cannot be so broken up” (Wechsler, 1958, p. 23).
Following Wechsler’s death in 1981, the test publisher has slowly but inexorably gravitated toward a multifactor interpretive model, expanding coverage to four factors in the 1991 WISC-III (verbal-comprehension, perceptual-organization, freedom from distractibility, and processing speed) and four factors in the 1997 WAIS-III (verbal-comprehension, perceptual-organization, working memory, and processing speed). The WISC-III featured a new subtest called Symbol Search to tap processing speed, and the WAIS-III added Matrix Reasoning to enhance the measurement of fluid reasoning and added Letter-Number Sequencing as a measure of working memory (The Psychological Corporation, 1997). There is a piecemeal quality to these changes in the Wechsler scales, guided less by a coherent approach than by a post hoc effort to impose theory upon existing Wechsler subtests. The theoretical directions to be charted for the Wechsler scales remain to be clearly articulated, but the words of Cronbach (1949) in describing the Wechsler scales remain salient: “One can point to numerous shortcomings. Most of these arise from Wechsler’s emphasis on clinical utility rather than upon any theory of mental measurement” (p. 158).
Standardization Features and Psychometric Adequacy
The Wechsler scales are renowned for their rigorous standardizations, and their revisions with normative updates are now occurring about every 15 years, apparently in response to the Flynn effect (see the chapter by Wasserman & Bracken in this volume). The Wechsler scales tend to utilize a demographically stratified (and quasi-random) sampling approach, collecting a sample at most age levels of about n = 200 divided equally by sex. Larger sample sizes are most important during ages undergoing changes such as the rapid cognitive development in young school-aged children and the deterioration in older individuals. Unfortunately, the WAIS-III sample reduces its sample size requirements (to n = 150 and n = 100) at the two age levels between 80 and 90, although these individuals by virtue of their deterioration actually merit an increased sample size. Stratification targets are based on the most contemporary census figures for race-ethnicity, educational level (or parent educational level for children), and geographic region. The manuals for the Wechsler scales typically report demographic characteristics of the standardization sample across stratification variables, so it is possible to ascertain that characteristics were accurately and proportionally distributed across groups rather than concentrated in a single group. Individuals with sensory deficits or known or suspected neurological or psychiatric disorders were excluded from the WAIS-III sample in an effort to enhance the clinical sensitivity of the measure.
Internal consistency tends to be adequate for the Wechsler scales, although there are some isolated subtests with problems. Composite scores (FSIQ; Verbal IQ, VIQ; Performance IQ, PIQ; Verbal Comprehension Index, VCI; Perceptual Organization Index, POI; and Freedom From Distractibility Index and Working Memory Index, FDI-WMI) tend to yield average rs > .90 for the WISC-III and WAIS-III, although the FDI tends to be slightly lower. Test-retest stability coefficients are reported instead of internal consistency for the PSI. At the WISC-III subtest level, Arithmetic, Comprehension, and all performance subtests (with the exception of Block Design) have average reliabilities below .80. At the WAIS-III subtest level, only Picture Arrangement, Symbol Search, and Object Assembly have average reliability coefficients below .80, and Object Assembly in particular appears to decline in measurement precision after about age 70. Accordingly, the Wechsler scales show measurement precision slightly less than considered optimal for their intended decision-making applications.
Test-retest reliability tends to be adequate for the WISC-III and WAIS-III composite indexes and verbal scale subtests, although some performance subtests have less-than-optimal stability. For six age groups undergoing serial testing with test-retest intervals ranging from 12 to 63 days (Mdn = 23 days), the WISC-III yielded a mean corrected stability coefficient of .94 for FSIQ and in the .80s and .90s for composite scores, with the exception of a low FDI corrected stability coefficient of .74 for 6- to 7-year-old children. Corrected reliability coefficients for individual subtests ranged from a low of .54–.62 for Mazes to a high of .82–.93 for Vocabulary. Four subtests (Vocabulary, Information, Similarities, and Picture Completion) have an average corrected stability coefficient above .80 (Wechsler, 1991). Over an interval ranging from 2 to 12 weeks (M = 34.6 days) across four age groups, the WAIS-III FSIQ has a mean stability coefficient of .96 corrected for the variability of scores in the standardization sample. Mean corrected stability coefficients for the WAIS-III subtests range from the .90s for Vocabulary and Information to the .60s and .70s for Picture Arrangement and Picture Completion. Composite indexes all have corrected stability coefficients in the .80s and .90s (The Psychological Corporation, 1997).
The four-factor structure of the WISC-III and the WAIS-III, corresponding to the four interpretive indexes, has been found to be largely resilient across a variety of samples. The WISC-III has been reported to be factorially invariant across age (Keith & Witta, 1997), racial groups (Kush et al., 2001), deaf and hearing samples (Maller & Ferron, 1997), and Canadian and British samples (Cooper, 1995; Roid & Worrall,
1997). Among clinical and exceptional groups, the factor structure is consistent across samples of children in special education (Grice, Krohn, & Logerquist, 1999; Konold, Kush, & Canivez, 1997), children with psychiatric diagnoses (Tupa, Wright, & Fristad, 1997), and children with traumatic brain injury (Donders & Warschausky, 1997). The WAIS-III has been found to be factorially stable across the United States and Canada (Saklofske, Hildebrand, & Gorsuch, 2000) and across mixed psychiatric and neurologically impaired samples (Ryan & Paolo, 2001).
WISC-III and WAIS-III subtest floors and ceilings tend to be good, spanning at least ±2 SDs at every age and usually larger. The lowest possible FSIQ yielded by the WISC-III is 40, and the highest possible FSIQ is 160. The WAIS-III has slightly less range, with FSIQs from 45 to 155. Ceilings on several of the performance subtests are obtained through the use of bonus points for speed. Perhaps one of the central weaknesses of the Wechsler scales is that most performance tests are timed. Although measuring speed of performance on subtests such as Block Design, Picture Arrangement, and Object Assembly allows for heightened ceilings and increased reliabilities, it may detract from the construct validity of the tests. The Wechsler scales now include a processing speed index, so the inclusion of speed dependency in other subtests is unnecessary and redundant.
Interpretive Indexes and Applications
Wechsler is reported to have administered and interpreted his own tests in a way that would be considered unacceptable today. For example, in practice he was known to administer the Vocabulary subtest alone to estimate intelligence and personality (Adam F. Wechsler, personal communication, December 3, 1993). Weider (1995) reports, “He never gave the Wechsler the same way twice,” and that he considered the standardization of his tests to have been imposed upon him by the test publisher. Kaufman (1994) has described Wechsler’s clinical approach to interpreting the scales, along with his interest in qualitative aspects of examinee responses to emotionally loaded verbal and pictorial stimuli. One need only read Wechsler’s (1939) The Measurement of Adult Intelligence to see that he interpreted every test behavior, every item response, every response error, and every problem-solving strategy.
Interpretations are derived from a decidedly formulaic and psychometric approach, based upon global composites, verbal and performance standard scores, factor indexes, and subtest scaled scores. Contemporary interpretation of the Wechsler intelligence scales typically involves a hierarchical approach involving (a) interpreting the FSIQ if there is relatively little scatter in the verbal and performance scales or index composites, (b) interpreting the verbal and performance standard scores (and meaningful discrepancies) if there is relatively little scatter in the factor-based index scores, (c) interpreting the four factor-based index scores (and meaningful discrepancies) when there is relatively little scatter in each one’s constituent subtests, (d) interpreting scores at the subtest level if there is sufficient evidence to support the interpretation of unique and specific variance, and (e) interpreting responses, errors, and strategies on individual items that are clinically relevant and normatively unusual. The composite and factor-based indexes for the WISC-III and WAIS-III appear in Table 18.5, with our own descriptions appended.
After interpretation of the FSIQ, the most common score interpreted on the Wechsler scales is the discrepancy between the verbal and performance IQs. Leading interpretations of the discrepancies are presented in Alan Kaufman’s books on the Wechsler scales (Kaufman, 1994; Kaufman & Lichtenberger, 1999, 2000). Logical comparisons between clusters of subtests that may guide interpretation also appear in Kaufman’s body of work.
The Wechsler scales are the most widely used intelligence tests for identification of intellectually gifted and learning disabled students, individuals with mental retardation, and older adults with dementias and disabilities. In spite of their deep entrenchment among practitioners and thousands of research publications, their principal value is still based upon their measurement of the general factor g and their practical verbal-nonverbal split, both very old concepts.
Woodcock-Johnson Tests of Cognitive Abilities
The Woodcock-Johnson III Tests of Cognitive Abilities (WJ III Cog; Woodcock, McGrew, & Mather, 2001a) represent the most recent revision of an assessment battery with prior editions from 1977 and 1989. Normed for use from ages 2 through 90+ years, the WJ III Cog is conormed with a leading achievement test. The battery’s origins may be traced to Richard W. Woodcock’s employment in a sawmill and a butcher shop, where he earned about $1.00 per hour, after completion of military duty in the navy during World War II. Upon reading Wechsler’s (1939) Measurement of Adult Intelligence, Woodcock was inspired to study psychology, quit his previous job, and joined the Veteran’s Testing Bureau for a wage of $0.55 per hour! Woodcock began active development of the WJ Cog in 1963 in a series of controlled learning experiments and furthered its development during a 1974–1975 fellowship in neuropsychology at Tufts University, where he adapted the Category Test. The first edition of the WJ Cog was published in 1977. Unlike prior editions, the WJ III Cog yields an intelligence composite score and explicitly presents itself as a multifactor intelligence test.
TABLE 18.5 Wechsler Intelligence Scales (WISC-III and WAIS-III)
Composite Indexes: Description
Full Scale IQ (FSIQ): Average level of cognitive functioning, sampling performance across a wide variety of complex verbal and performance tasks.
General Ability Index (GAI): Overall level of cognitive functioning, based on subtests strongly associated with general intelligence or g; available for the WISC-III only; see Prifitera, Weiss, and Saklofske (1998).
Verbal IQ (VIQ): Average cognitive ability on verbal-language-based tasks requiring declarative knowledge and problem solving, varying in the complexity of problem-solving operations, the degree of abstract reasoning required, and the extent of the required verbal response.
Performance IQ (PIQ): Average cognitive ability on performance tasks with reduced language emphasis; dependent on spatial cognition, fine motor coordination, and ideational and psychomotor speed.
Verbal Comprehension Index (VCI): Responses to language-based tasks requiring crystallized-declarative knowledge and limited problem solving, varying in the degree of abstract reasoning required and expressive language requirements (based on Information, Similarities, Vocabulary, and Comprehension subtests).
Perceptual Organization Index (POI): Performance on tasks making high demands on spatial cognition, motor coordination, and ideational speed (based on Picture Completion, Picture Arrangement, Block Design, and Object Assembly subtests).
Freedom From Distractibility Index (FDI) and Working Memory Index (WMI): Auditory immediate-working memory capacity, dependent on capacity and complexity of mental operations as well as facility with number processing (based on Digit Span and Arithmetic subtests in WISC-III; Arithmetic, Digit Span, and Letter-Number Sequencing in WAIS-III).
Processing Speed Index (PSI): Efficiency of performance on psychomotor tasks with low to moderate cognitive processing demands; nonspecifically sensitive to nature and severity of disruptions in cognitive processing from a variety of disorders (based on Coding–Digit Symbol and Symbol Search subtests).

extended battery. All items are administered from an easel or audiotape. The WJ III Cog requires computer scoring and cannot be scored by hand. The WJ III Cog is distinguished from other intelligence tests by the elegance of its factorial structure, but its strength as a factor-driven instrument is offset by the absence of demonstrated clinical relevance for its factors.
Theoretical Underpinnings
The WJ III Tests of Cognitive Abilities is based upon what has been called the Cattell-Horn-Carroll (CHC) theory of cognitive abilities, but it has also been referred to in the literature as Horn-Cattell theory, fluid and crystallized intelligence theory, and extended Gf-Gc theory. The theory has been described as a hierarchical, multiple-stratum model with g or general intelligence at the apex (or highest stratum), 7–10 broad factors of intelligence at the second stratum, and at least 69 narrow factors at the first stratum. The model has recently been termed an integrated or synthesized CHC framework (McGrew, 1997; McGrew & Flanagan, 1998), and it forms the basis for the cross-battery approach to cognitive assessment. With the WJ III Cog as the anchor for (and only relatively complete representation of) this model, it attempts to resolve incongruities between the work of Horn, Carroll, and others. Our focus here is primarily upon the seven broad cognitive abilities tapped by the WJ III Cog (Gc, Glr, Gv, Ga, Gf, Gs, and Gsm; abbreviations are explained in the following discussion) and their contribution to the General Intellectual Ability score, which is a differentially weighted estimate of g. The WJ III technical manual (McGrew & Woodcock, 2001) reports the smoothed g weights; in descending order, the most weighted tests are Gc, Gf, Glr, Gsm, Ga, Gs, and Gv. This weighting scheme represents a major point of departure from prior investigations (e.g., Carroll, 1993; Gustafsson, 1984, 1988; Undheim, 1981) establishing Gf as the most substantial contributor to g. In practical terms, it expresses the idea that learned information contributes more to one’s intelligence than does one’s ability to reason.
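To make the consequence of differential weighting concrete, the following minimal Python sketch contrasts a differentially weighted composite with an equally weighted one. It is an illustration only. The numeric weights are hypothetical values that merely preserve the descending order described above (Gc, Gf, Glr, Gsm, Ga, Gs, Gv); they are not the smoothed g weights from the WJ III technical manual. The names ILLUSTRATIVE_WEIGHTS, weighted_composite, and equal_weight_composite are assumptions introduced here, and the example profile is invented.

```python
# Hypothetical weights: only their ORDER (Gc > Gf > Glr > Gsm > Ga > Gs > Gv)
# follows the text. The numeric values are invented for illustration and are
# NOT the g weights reported in the WJ III technical manual.
ILLUSTRATIVE_WEIGHTS = {
    "Gc": 0.20, "Gf": 0.18, "Glr": 0.16, "Gsm": 0.14,
    "Ga": 0.12, "Gs": 0.11, "Gv": 0.09,
}


def weighted_composite(z_scores, weights):
    """Weighted mean of factor z scores (weights are normalized to sum to 1)."""
    total = sum(weights.values())
    return sum(weights[name] * z_scores[name] for name in weights) / total


def equal_weight_composite(z_scores):
    """Unweighted mean of the same factor z scores."""
    return sum(z_scores.values()) / len(z_scores)


# Invented profile: one SD above the mean on Gc, one SD below on Gv,
# average on every other broad ability.
profile = {"Gc": 1.0, "Gf": 0.0, "Glr": 0.0, "Gsm": 0.0,
           "Ga": 0.0, "Gs": 0.0, "Gv": -1.0}

weighted = weighted_composite(profile, ILLUSTRATIVE_WEIGHTS)
unweighted = equal_weight_composite(profile)
print(round(weighted, 2), round(unweighted, 2))
```

Under these assumed weights, the strength in Gc outweighs the equal-sized weakness in Gv, so the weighted composite is positive while the equally weighted composite is zero. The result is a weighted mean of z scores rather than a normed standard score; an operational composite would also be scaled against the standardization sample.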
Standardization Features and Psychometric Adequacy
was statistically weighted to correct for proportional underrepresentation of selected groups, including Hispanics and parents with education levels below high school completion. It is not possible to assess the degree to which the sample is representative of the general population because accuracy is only reported on the margins without detailed reporting across stratification variables. Accordingly, it is possible that minorities in the sample are not representative of the general population in terms of age, sex, or parent education level. Sample weighting under these circumstances may magnify errors associated with inaccuracy in specific sampling cells. Some irregularities appear in the samples reported in the test technical manual (McGrew & Woodcock, 2001). For example, of the 2,241 children from ages 9 to 13 reported in the norming sample (p. 18), only 1,875 completed the verbal comprehension test, only 1,454 took the planning test, and only 561 obtained scores on the pair cancellation test (p. 161). The pair cancellation sample suggests that as many as 75% of the sample may not have completed some tests for some age groups in the WJ III Cog.
Test internal consistency was calculated with the split-half procedure with Spearman-Brown correction and with Rasch procedures for eight tests that were either speeded or contained multiple point scoring. Test score reliability appears to be fully adequate, with median values across age falling below r = .80 for picture recognition and planning only. The clusters also tend to be highly reliable, with all but three having median values above .90 (the exceptions are long-term retrieval at .88, visual-spatial thinking at .81, and short-term memory at .88). The overall composite General Intellectual Ability (GIA) has a median reliability of .97 for the standard battery and .98 for the extended battery. Test-retest score reliabilities are reported in Rasch ability units for selected tests at varying test-retest intervals, with no apparent correction for variability at the time of first testing, thereby probably yielding artificially inflated values because of the large standard deviations. Accordingly, these findings are reported with caution. The five speeded WJ III Cog tests have a median 1-day stability coefficient of .81 (range from .78 to .87) for ages 7–11, .78 (range from .73 to .85) for ages 14–17, and .73 (range from .69 to .86) for ages 26–79. Test-retest reliabilities for selected tests administered over multiyear intervals (apparently collected as part of an unspecified longitudinal study using prior editions of the WJ; tests that no longer appear in the battery are included) yield a range of stability coefficients from .60 to .86, suggesting that some of the tests have high degrees of stability over extended periods of time.
WJ III Cog floors and ceilings cannot be formally evaluated because the test may only be computer-scored, and no printed norms are available. The examiner’s manual reports that test standard scores extend from 0 to over 200 (p. 72; Mather & Woodcock, 2001), but this range seems inflated, given that adequate test floors tend to be difficult to achieve with certain age groups such as preschool children.
The WJ III GIA score tends to be highly correlated with composites from other intelligence tests, although correlations are not corrected for restricted or expanded ranges. According to the WJ III technical manual, the GIA standard scale correlates .76 with the DAS General Conceptual Ability, .75 with the KAIT Composite Intelligence Scale, .76 with the Stanford-Binet Composite SAS, .76 with the WISC-III FSIQ, and .67 with the WAIS-III FSIQ.
Factor analytic studies of the WJ III constitute an area of concern for a test battery that has historically based its foundation on the work of Cattell, Horn, and Carroll. Exploratory factor analyses are not reported in the technical manual, although the addition of eight new subtests to the WJ III Cog certainly justifies these analyses. The new WJ III Cog subtests purport to measure working memory, planning, naming speed, and attention. Moreover, hierarchical exploratory factor analyses conducted by John B. Carroll (using the same approach described in his 1993 book) have been previously reported for the WJ-R (see McGrew, Werder, & Woodcock, 1991, p. 172); these analyses yield findings of first-order and second-order factors that are not entirely congruent with the structure of the WJ Cog. As a basis for comparison, other tests in their third editions (e.g., WISC-III, WAIS-III) continue to report exploratory factor analyses, and tests that reported only confirmatory analyses (e.g., Stanford-Binet) have proven to have factor structures that have been effectively challenged (e.g., Sattler, 1988; R. M. Thorndike, 1990). With the exception of the Stanford-Binet and the WJ III, every test discussed in this chapter reports the results of exploratory factor analyses.
The confirmatory factor analyses (CFAs) reported in the WJ III technical manual appear to provide marginal support for a seven-factor structure relative to two alternative models, but the root mean squared errors of approximation (RMSEA, which should ideally be less than .05 with good model fit) do not support good model fit at any age level. The CFAs involve a contrast between the seven-factor CHC structure, a WAIS-based model, and a Stanford-Binet-based model, the latter two with model specifications that Wechsler or Stanford-Binet devotees would likely argue are misrepresentations. None of the models are hierarchical, none include a superordinate g, and none include the higher order dimensions suggested by Woodcock in his cognitive performance model. Moreover, only three goodness-of-fit indexes are included, whereas best practice with CFAs suggests that fit statistics should ideally include indexes sensitive to model fit,
model comparison, and model parsimony. On a model built on multifactor foundations, it may be argued that a more rigorous CFA test of alternative models is appropriate.
Interpretive Indexes and Applications
The WJ III Cog consists of 20 tests purporting to measure seven broad cognitive factors. The tests are organized into a standard battery (Tests 1 through 7, with three supplemental tests) and an extended battery (Tests 1 through 7 and Tests 11 through 17, with six supplemental tests). The WJ III Cog is normed for ages 2 years through 90+ years and is conormed with 22 tests in an achievement battery, the WJ III Tests of Achievement (Woodcock, McGrew, & Mather, 2001b).

TABLE 18.6 Woodcock-Johnson III Tests of Cognitive Abilities (Woodcock, McGrew, & Mather, 2001a)
Composite Indexes: Description
General Intellectual Ability (GIA): A weighted estimate of general cognitive ability.
Comprehension-Knowledge (Gc): The breadth and depth of prior learning about both verbal facts and information.
Long-Term Retrieval (Glr): The ability to efficiently acquire and store information, measured by long-term and remote retrieval processes.
Visual Processing (Gv): Analysis and synthesis of spatial-visual stimuli and the ability to hold and manipulate mental images.
Auditory Processing (Ga): The ability to discriminate, analyze, and synthesize auditory stimuli; also related to phonological awareness.
Fluid Reasoning (Gf): The ability to solve novel, abstract, visual, and nonverbal problems.
Short-Term Memory (Gsm): The ability to hold, transform, and act upon auditory information in immediate awareness; the capacity of the auditory loop in mental operating space.
Processing Speed (Gs): Speed and efficiency in performing simple cognitive tasks.

In spite of the factor analytic findings reported in the preceding section, the WJ III Cog model is an elegant exemplar of the multifactor approach to cognitive abilities. Its factor analytic lineage may be most clearly traced from the pioneering efforts in factor analysis of ability tests by Thurstone (1938) to the encyclopedic tome by Carroll (1993), along with seminal contributions by Cattell and Horn. It is this association with a large body of factor analytic research that constitutes the WJ III Cog’s main strength.
Unfortunately, a systematic overreliance on this same body of factor analytic research as evidence of test validity constitutes the most substantial weakness of the WJ III Cog. The WJ III Cog structure is a structural model missing the integrative, explanatory, and predictive glue that constitutes a scientific theory. To their credit, advocates for the WJ Cog have acknowledged this shortcoming: “Gf-Gc provides little information on how the Gf-Gc abilities develop or how the cognitive processes work together. The theory is largely product oriented and provides little guidance on the dynamic interplay of variables (i.e., the processes) that occur in human cognitive processing” (Flanagan, McGrew, & Ortiz, 2000, p. 61).
The WJ III Cog also has little demonstrated diagnostic value. The technical manual includes no investigations of samples of mentally retarded or intellectually gifted individuals, making it the only intelligence test in this chapter failing to report findings with these important criterion groups. Three studies with other special populations (two with ADHD and one with a college learning disabled sample) fail to include a normative comparison group or report any indexes of effect size or statistical significance testing. In general, these few studies are consistent with Woodcock’s (1998) report of results with 21 diagnostic groups in suggesting that the WJ Cog has limited value in identifying or differentiating clinical and exceptional samples.
Finally, the WJ III Cog offers little in the way of empirically based assessment-intervention linkages. Although logical interventions and recommendations have been offered in Mather and Jaffe (1992), there is a conspicuous absence of empirical verification for these assessment-intervention linkages. In spite of its apparent assets, the WJ III Cog lacks a coherent theoretical framework, established clinical correlates, and empirically demonstrated treatment utility. This is an unsatisfying state of affairs for a factorial model of nearly 70 years’ duration and a cognitive battery available for 25 years and now in its third edition. Kaufman (2000), in referring to the Carroll, Horn, and Cattell models, suggested that “there is no empirical evidence that these approaches yield profiles for exceptional children, are directly relevant to diagnosis, or have relevance to eligibility decisions, intervention or instructional planning—all of which are pertinent for school psychologists” (p. 27). Accordingly, the WJ III Cog provides clear evidence that claims of test structural validity are unrelated to its applied utility for clinical and educational decision making.
INTELLECTUAL ASSESSMENT AND DIAGNOSTIC CLASSIFICATION
In this section, general approaches to diagnostic utility of intelligence tests are described, specifically listing several diagnostic categories that are operationally defined through the use of cognitive or intelligence tests. As suggested at the beginning of this chapter, one characteristic of a mature
clinical science is the generation of a coherent diagnostic taxonomy, derived from and consistent with theory. A theory of intelligence (and tests developed according to the theory) should have value in generating a classification system by which clusters of individuals sharing common clinical characteristics may be systematically and meaningfully grouped. A classificatory taxonomy based on intelligence test results began at the start of the 20th century by assigning individuals who were at the extreme ends of the distribution of general intelligence to diagnostic groups now known as mental retardation and intellectual giftedness.
The Diagnostic and Statistical Manual of Mental Disorders–Fourth Edition–Text Revision (DSM-IV-TR; American Psychiatric Association, 2000), the most recent edition, contains several diagnostic classes based upon criteria related to cognitive or intelligence test results, including mental retardation, learning disorders, dementia, and a proposed new category, mild neurocognitive disorder. Amnestic disorders are defined by a disturbance in memory functioning that may be specifically quantified with neuropsychological testing, although several intelligence tests include measures of long-term memory ability that may be useful in arriving at a diagnosis of amnesia.
Individuals With Mental Retardation
There are several diagnostic approaches to identifying individuals with mental retardation, some of which rely on intellectual disability and impairment in areas of adaptive behavior. The DSM-IV-TR requires significantly subaverage general intellectual functioning, accompanied by significant limitations in adaptive functioning in at least two of the following skill areas: communication, self-care, home living, social-interpersonal skills, use of community resources, self-direction, functional academic skills, work, leisure, health, and safety. Table 18.7 contains a summary of these criteria (American Psychological Association Division 33 Editorial Board, 1996). Onset must occur during the developmental period, and deficits are expected to adversely affect an individual’s educational performance.

TABLE 18.7 Levels of Mental Retardation
Level      IQ Range          IQ Deviation Cutting Point    Extent of Concurrent Adaptive Limitations
Mild       50–55 to 70–75    −2 SD                         Two or more domains.
Moderate   35–40 to 50–55    −3 SD                         Two or more domains.
Severe     20–25 to 35–40    −4 SD                         All domains.
Profound   below 20 or 25    −5 SD                         All domains.
Note. IQ range scores are for a test with a standard score mean of 100 and SD of 15. Adapted from American Psychological Association Division 33 Editorial Board (1996).

The 1992 definition from the American Association on Mental Retardation (AAMR; Luckasson et al., 1992) shifts the emphasis from subtyping on the basis of IQ ranges alone toward an assessment of the degrees of support required to function well intellectually, adaptively, psychologically, emotionally, and physically. The AAMR definition involves a three-step procedure for diagnosing, classifying, and determining the needed supports of individuals with mental retardation: (a) an IQ of 70–75 or below, with significant disabilities in two or more adaptive skill areas and age of onset below 18; (b) identification of strengths and weaknesses and the need for support across four dimensions (intellectual functioning and adaptive skills, psychological-emotional considerations, physical-health-etiological considerations, and environmental considerations); and (c) identification of the kinds and intensities of supports needed for each of the four dimensions. The four classification levels for mental retardation are intermittent (need for support during stressful or transition periods but not constantly), limited (less intense, consistent supports needed but time limited for changing situations), extensive (long-term consistent support at work, at home, or both), and pervasive (very intense, long-term, constant support needed across most or all situations). Intelligence tests continue to play a role in the diagnosis of mental retardation, although their role has been slightly de-emphasized in the AAMR definition.
Individuals Who Are Intellectually Gifted
Giftedness has traditionally been defined in terms of elevated general intelligence (Hollingworth, 1942; Terman, 1925). In 1972 the U.S. federal government adopted its first definition of gifted and talented students; this definition was based on a report to Congress from former U.S. Commissioner of Education Sidney P. Marland:
Gifted and talented children are those, identified by professionally qualified persons, who by virtue of outstanding abilities are capable of high performance. These children who require differentiated programs and/or services beyond those normally provided by the regular school program in order to realize their contribution to self and society. Children capable of high performance include those with demonstrated high achievement and/or potential ability in any of the following areas, singly or in combination; general intellectual ability, specific academic aptitude, creative or productive thinking, leadership ability, visual and performing arts, and/or psychomotor ability. (p. 2)
This definition and subsequent public law do not, however, mandate that gifted and talented students be served in special education, and states and individual school districts vary as to how they define giftedness and whom they serve. High level of intelligence remains the most common single criterion of
giftedness (Callahan, 1996), although the use of multiple measures and approaches transcending intelligence tests alone is considered to constitute best assessment practice (Gallagher, 1994).

Levels of intellectual giftedness appear in Table 18.8 and have appeared in various forms in the literature (Gross, 1993; Hollingworth, 1942; Terman, 1925). Few intelligence tests have sufficient ceiling to serve the upper levels; as a result, comparatively little research on exceptionally and profoundly gifted children has been conducted.

TABLE 18.8 Levels of Intellectual Giftedness

Level                  IQ Range          IQ Deviation Cutting Point
Profoundly gifted      above 175–180     +5 SD
Exceptionally gifted   160–174           +4 SD
Highly gifted          145–159           +3 SD
Gifted                 130–144           +2 SD

Note. IQ range scores are for a test with a standard score mean of 100 and SD of 15.

Individuals With Specific Learning Disabilities

The current reauthorization of the Individuals with Disabilities Education Act (IDEA; PL 105-17) defines specific learning disability as “a disorder in one or more of the basic psychological processes” involved in language comprehension, language expression, reading, writing, spelling, or mathematics. Specific learning disabilities are operationally assessed in different ways; the most common ones are (a) significant discrepancies between measured intelligence and academic achievement skills, and (b) isolated relative weaknesses in core cognitive processes such as phonological awareness that are thought to contribute to the development of reading decoding skills and subsequent success in reading. In both of these approaches, the role of cognitive-intellectual assessment is central.

Both assessment approaches have their limitations. The intelligence-achievement discrepancy model as a basis for identifying reading disability has been criticized for its implicit assumption that intelligence predicts reading potential (e.g., Stanovich, 1991a, 1991b). The cognitive processing approach requires that the specific processes contributing to performance in reading, for example, be included as part of an assessment intended to detect reading disability. Most intelligence tests do not include tests of these specialized abilities and processes as part of their battery.

Individuals With Dementias

Dementia refers to a generalized deterioration in cognitive functioning relative to a previously higher level of functioning. Alzheimer’s disease is the most common dementia. Diagnostic criteria for dementia appearing in the International Classification of Diseases–Tenth Edition (ICD-10; World Health Organization, 1992) include a decline in memory; a decline in other cognitive abilities, characterized by deterioration in judgment and thinking such as planning, organizing, and general processing of information; and preserved awareness of the environment. Other criteria include a decline in emotional control and a minimal duration of 6 months. The use of mental status examination results is sometimes sufficient to arrive at a diagnosis of dementia, but formal cognitive and neuropsychological assessment is often necessary to fully document the nature and extent of any suspected deterioration. Identification of dementias constitutes the raison d’être for administering intelligence tests to older adults.

INTELLECTUAL ASSESSMENT AND INTERVENTION

Perhaps the most telling indicator of the existing intervention utility of intelligence tests may be found in Maruish (1999), a 1,500-page tome on the use of psychological testing for treatment planning with no mention of intelligence or IQ. After nearly a century and in what must be considered one of applied psychology’s greatest failures, intellectual assessment has not been systematically linked to effective interventions. Several high-profile failures to link cognitive profiles to treatment (e.g., Kaufman & Kaufman, 1983b; Kirk, McCarthy, & Kirk, 1968) have deservedly led practitioners toward skepticism about the promise of including research-based recommendations in their psychological reports. There is, however, reason for guarded optimism regarding the future of assessment-intervention linkages based upon new remediation programs that utilize principles from cognitive instruction and neuronal plasticity. In this section, a few of these interventions are examined, along with some historical perspectives on intelligence-related intervention research.

Studies linking intelligence assessment to intervention date to the origins of intelligence testing. Binet (1909/1975) was unequivocal about his belief in the effectiveness of cognitive intervention, arguing that education tailored to a child’s aptitudes could increase intelligence: “Twenty-five years of experimentation in schools have led me to believe that the most important task of teaching and education is the identification of children’s aptitudes. The child’s aptitudes must dictate the kind of education he will receive and the profession toward which he will be oriented” (Binet, 1909/1975, p. 23). He described programs that were antecedents to special education that partitioned mentally retarded students
according to their intellectual abilities. Moreover, he described a series of exercises called mental orthopedics that were intended to enhance the efficiency of the cognitive faculties, especially in mentally handicapped children.

Contemporary investigations into intervention utility may be traced to the introduction, by Lee J. Cronbach (1957), of the concept of aptitude by treatment interactions (ATI). In formulating assessment recommendations, Cronbach recommended that applied psychologists consider individual differences and treatments simultaneously in order to select the best group of interventions to use and the optimal allocation of individuals to interventions. Of ATI, he predicted that “ultimately we should design treatments, not to fit the average person, but to fit groups of students with particular aptitude patterns. Conversely, we should seek out the aptitudes which correspond to (interact with) modifiable aspects of the treatment” (Cronbach, 1957, pp. 680–681).

In collaboration with Cronbach, Richard E. Snow developed a sampling-assembly-affordance model of ATI with the objective of elucidating person-treatment matching approaches in learning and instruction. Snow emphasized that ATI involved the complex interaction between persons and situations, with aptitude defined as “relatively stable psychological characteristics of individuals that predispose and thus predict differences in later learning under specified instructional conditions” (Snow, 1998, p. 93). The true focus of study, he argued, should be neither the main effects of the treatment nor the characteristics of the learner, but rather the interface between the two. Moreover, Snow (1998) recommended that ATI serve as additional criteria for construct validation beyond traditional validation approaches, insofar as ATI requires determination of the situational boundaries within which an ability can predict learning and ATI requires experimental manipulation of abilities within circumscribed situations.

The success of cognitive and intelligence tests as tools in establishing ATI has been modest at best. Traditional intelligence tests such as the Wechsler scales have never been empirically linked to intervention, leading authorities to decry the “virtual absence of empirical evidence supporting the existence of aptitude x treatment interactions” with intelligence tests (Gresham & Witt, 1997, p. 249). Witt and Gresham (1985) commented specifically on the Wechsler scales: “In short, the WISC-R lacks treatment validity in that its use does not enhance remediation interventions for children who show specific academic skills deficiencies. . . . For a test to have treatment validity, it must lead to better treatments (i.e., better educational programs, teaching strategies, etc.)” (p. 1717). Tests with theory-driven remedial approaches such as the Illinois Test of Psycholinguistic Abilities (ITPA; Kirk et al., 1968) and the K-ABC (Kaufman & Kaufman, 1983b) that sought to match instruction to learning styles have generally yielded disappointing findings. Only two major tests, the CAS and the K-ABC, even address treatment and intervention in their manuals.

Assessment-intervention linkages in intelligence and ATI are being explored in both old and new areas: cognitive instruction and computerized instruction. In the following sections, illustrative examples are provided of new types of interventions for individuals with deficits identified through cognitive and intelligence testing. These interventions represent beginnings for a larger body of work likely to evolve in the near future.

Cognitive Instruction

The study of cognitive instruction is concerned with the interface between psychology and education, particularly the cognitive processes involved in learning (e.g., Mayer, 1992). In this section, a representative series of studies is described linking cognitive assessment to a program of educational instruction, based upon PASS theory as measured in the CAS (Naglieri & Das, 1997a). Compendiums of other cognitive instructional methods of demonstrated efficacy are available from Ashman and Conway (1993) and Pressley and Woloshyn (1995).

The planning facilitation method described by Naglieri (1999) is an intervention that may be applied to individuals or groups of children in as few as three 10-min sessions per week. It involves a nondirective emphasis on self-reflection, planning, and use of efficient problem-solving strategies and is taught through classroom group discussions led by teachers. It is intended to stimulate children’s use of planning, based on the assumption that planning processes should be facilitated rather than directly instructed so that children discover the value of strategy use without specific instruction.

The planning facilitation method may be administered following a classroom assignment, such as completion of an arithmetic worksheet. After students have worked on the problems, the teacher facilitates a discussion intended to encourage students to consider various ways to be more successful in completion of the assignment. The teacher typically offers probes or nondirective questions such as “How did you do the math?”, “What could you do to get more correct?”, or “What will you do next time?” Student responses become a beginning point for discussions and further development of ideas. Teachers are instructed to make no direct statements like “That is correct” or “Remember to use that same strategy,” nor do they provide feedback on the accuracy of worksheets. Moreover, they do not give mathematics instruction. The sole role of the teacher is to facilitate self-reflection, thereby encouraging the
students to plan so that they can more effectively complete their worksheet assignment. In response to the planning facilitation method, students arrive at their own problem-solving approaches, selectively incorporating any ideas from the class discussion that are perceived to be useful.

The initial investigations of planning facilitation were conducted based on PASS theory by Cormier, Carlson, and Das (1990) and Kar, Dash, Das, and Carlson (1992). Both investigations demonstrated that students differentially benefited from a verbalization technique intended to facilitate planning. Participants who initially performed poorly on measures of planning earned significantly higher scores than did those with good scores in planning. The verbalization method encouraged a carefully planned and organized examination of the demands of the task, differentially benefiting the children with low planning scores.

These investigations were the basis for three experiments by Naglieri and Gottling (1995, 1997) and Naglieri and Johnson (2000). The three studies focused on improving math calculation performance through teacher delivery of planning facilitation about two to three times per week. Teachers also consulted with school psychologists on a weekly basis to assist in the application of the intervention, monitor the progress of the students, and consider ways of facilitating classroom discussions. Students completed mathematics worksheets in a sequence of about 7 baseline and 21 intervention sessions over about a 2-month period. In the intervention phase, the students were given a 10-min period for completing a mathematics worksheet, a 10-min period was used for facilitating planning, and a second 10-min period was allocated for another mathematics worksheet. All students were given intervention sessions involving the three 10-min segments of mathematics-discussion-mathematics in 30-min instructional periods.

The first two research studies by Naglieri and Gottling (1995, 1997) demonstrated that planning facilitation led to improved performance on multiplication problems for those with low scores in planning, but minimal improvement was found for those with high planning scores. Thus, students benefited differentially from instruction based on their cognitive processing patterns. Using the planning facilitation method with a larger sample of children with learning problems, Naglieri and Johnson (2000) sought to determine whether children with specific PASS profiles would show different rates of improvement on mathematics performance. Children with cognitive weaknesses (i.e., an individual PASS standard score below 85 and significantly lower than the child’s own mean) in either the Planning, Attention, Simultaneous, or Successive scales were selected to form contrast groups. The contrasting groups of children responded very differently to the intervention. Children with a cognitive weakness in Planning improved considerably over baseline rates, whereas those with no cognitive weakness improved only marginally. Children with cognitive weaknesses in the Simultaneous, Successive, and Attention scales also showed substantially lower rates of improvement. These three studies, summarized with the two previous investigations in Table 18.9, illustrate that PASS cognitive processes are relevant to effective educational intervention in children with and without learning disabilities.

TABLE 18.9 Summary of Planning Facilitation Research Investigations: Percentage of Change From Baseline to Intervention for Children With High or Low Planning Scores

Study                               High Planning   Low Planning   Difference
Cormier, Carlson, & Das (1990)      5%              29%            24%
Kar, Dash, Das, & Carlson (1992)    15%             84%            69%
Naglieri & Gottling (1995)          26%             178%           152%
Naglieri & Gottling (1997)          42%             80%            38%
Naglieri & Johnson (2000)           11%             143%           132%
Median values across all studies    15%             84%            69%

Computerized Instruction

The prospect that highly individualized and tailored programs of instruction and remediation may be delivered by computer represents a new trend needing validation and independent verification. Based upon findings that phonemic discrimination deficits contribute to reading problems, decoding impairments, and various language problems (e.g., Anderson et al., 1993), one promising technology-based program of instruction uses acoustically modified sounds and cross-training methods to directly train phoneme discrimination. Known as Fast ForWord (Tallal, 2000), the training program resembles a computer game and features adaptive instruction (centered slightly above the examinee’s level of mastery), highly intensive and frequent training (for 100 min per day, 5 days per week, over 4–8 weeks), and high levels of reinforcement (through the use of computer-delivered reinforcement). The Fast ForWord training program reportedly yields statistically significant improvement in temporal processing thresholds, speech discrimination, and listening comprehension, and it results in a significant shift along the normal distribution of language comprehension scores for academically at-risk children (Tallal, 2000). Moreover, the training program purports to exploit the dynamic plasticity of the brain by remapping neural circuitry associated with phonemic discrimination (Tallal, 2000). Independent verification of treatment effectiveness has yet to be reported for this program, but more such programs can be expected to be developed as technological interventions continue to affect
educational practices. These new generation interventions may provide opportunities to link cognitive assessment to focused interventions.

TOWARD A MATURE CLINICAL SCIENCE

Whither goeth intellectual assessment? Most of the subtest procedures currently in use were created before 1930, and the leading interpretive models of intelligence date back nearly as far. As Oscar K. Buros commented in 1977, “. . . except for the tremendous advances in electronic scoring, analysis, and reporting of test results, we don’t have a great deal to show for fifty years of work” (p. 10).

If the past provides the best prediction of the future, then by around the year 2050 we may expect seventh-edition revisions of the Stanford-Binet, the WISC, and the WAIS. As computer usage and online test scoring applications continue to grow among practitioners, these tests may be expected to feature more sophisticated technology, including online administration, scoring, and interpretation. Computer administration also permits more accurate adaptive testing, so assessment batteries should grow progressively shorter and more focused on an examinee’s ability level. Psychometric techniques such as Rasch scaling have had little discernible impact on the material substance of intellectual tests thus far, but as psychometric techniques evolve, the process of test development should become more efficient and streamlined, reducing test development time and costs and offering practitioners more choices in intelligence assessment.

Neurophysiological assessment has been described as one methodology that may eventually supersede psychometric assessment. For example, Matarazzo (1992) speculated that the future of intelligence testing is to “record individual differences in brain functions at the neuromolecular, neurophysiologic, and neurochemical levels” (p. 1007). Current neurophysiological techniques that show promise include evoked potentials and nerve conduction velocity, quantitative electroencephalography, and measures of cerebral glucose metabolism.

More important than changes in technology, however, will be changes in fundamental assessment paradigms. Science does not advance slowly and gradually, but rather in brief periods of intense change, reappraisal, and upheaval (e.g., Kuhn, 1970). Challenges to conventional thinking in intelligence assessment have laid the groundwork for a paradigm shift, and new tests delivering additional applied value to the practitioner have the greatest likelihood of success in the future. It is possible to envision a time when the psychological assessment results for a child referred for learning problems in school, for example, will (a) yield results commensurate with the ways in which we know learning to occur, (b) describe the impaired cognitive abilities-processes that specifically contribute to the learning problems, (c) assess the degree to which the child’s ability-process profile resembles that obtained by other children in diagnostic groups with similar patterns of learning problems, and (d) prescribe a series of interventions that have been demonstrated to be effective in addressing the special needs of children with similar test score profiles. The combination of a well-developed theory, valid and reliable tests, a cognitive diagnostic nomenclature related to abilities and processes, and effective interventions linked to assessment may one day enable the field of intelligence assessment to become a mature applied clinical science.

REFERENCES

American Association on Mental Retardation (AAMR). (1992). Mental retardation: Definition, classification, and systems of supports (9th ed.). Washington, DC: Author.
American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., Text Revision). Washington, DC: Author.
American Psychological Association, Division 33 Editorial Board. (1996). Definition of mental retardation. In J. W. Jacobson & J. A. Mulick (Eds.), Manual of diagnosis and professional practice in mental retardation (pp. 13–53). Washington, DC: American Psychological Association.
Anastasi, A. (1989). Review of the Stanford-Binet Intelligence Scale, Fourth Edition. In J. C. Conoley & J. J. Kramer (Eds.), The tenth mental measurements yearbook (pp. 771–773). Lincoln, NE: Buros Institute of Mental Measurements.
Anderson, K., Brown, C., & Tallal, P. (1993). Developmental language disorders: Evidence for a basic processing deficit. Current Opinion in Neurology and Neurosurgery, 6, 98–106.
Ashman, A. F., & Conway, R. N. F. (1993). Using cognitive methods in the classroom. New York: Routledge.
Ashman, A. F., & Das, J. P. (1980). Relation between planning and simultaneous-successive processing. Perceptual and Motor Skills, 51, 371–382.
Aylward, G. P. (1992). Differential Abilities Scales. In J. J. Kramer & J. C. Conoley (Eds.), The eleventh mental measurements yearbook (pp. 281–282). Lincoln, NE: Buros Institute of Mental Measurements.
Barkley, R. A. (1997). ADHD and the nature of self-control. New York: Guilford Press.
Binet, A. (1975). Modern ideas about children (S. Heisler, Trans.). Menlo Park, CA: Suzanne Heisler. (Original work published 1909)
Binet, A., & Simon, T. (1916). The development of intelligence in the child. In E. S. Kite (Trans.), The development of intelligence
in children: The Binet-Simon Scale (pp. 182–273). Baltimore: Williams and Wilkins. (Original work published 1908)
Bracken, B. A. (1985). A critical review of the Kaufman Assessment Battery for Children (K-ABC). School Psychology Review, 14, 21–36.
Buros, O. K. (1977). Fifty years in testing: Some reminiscences, criticisms, and suggestions. Educational Researcher, 6, 9–15.
Callahan, C. M. (1996). A critical self-study of gifted education: Healthy practice, necessary evil, or sedition? Journal for the Education of the Gifted, 19, 148–163.
Camara, W. J., Nathan, J. S., & Puente, A. E. (2000). Psychological test usage: Implications in professional psychology. Professional Psychology: Research and Practice, 31, 141–154.
Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. New York: Cambridge University Press.
Carroll, J. B. (1995). Review of the book Assessment of cognitive processing: The PASS theory of intelligence. Journal of Psychoeducational Assessment, 13, 397–409.
Cohen, R. J., & Swerdlik, M. E. (1999). Psychological testing and assessment: An introduction to tests and measurement (4th ed.). Mountain View, CA: Mayfield.
Cooper, C. (1995). Inside the WISC-III-UK. Association of Educational Psychologists Journal, 10, 215–219.
Cormier, P., Carlson, J. S., & Das, J. P. (1990). Planning ability and cognitive performance: The compensatory effects of a dynamic assessment approach. Learning and Individual Differences, 2, 437–449.
Cronbach, L. J. (1949). Essentials of psychological testing. New York: Harper.
Cronbach, L. J. (1957). The two disciplines of scientific psychology. American Psychologist, 12, 671–684.
Cronbach, L. J. (1989). Review of the Stanford-Binet Intelligence Scale, Fourth Edition. In J. C. Conoley & J. J. Kramer (Eds.), The tenth mental measurements yearbook (pp. 773–775). Lincoln, NE: Buros Institute of Mental Measurements.
Das, J. P., Naglieri, J. A., & Kirby, J. R. (1994). Assessment of cognitive processes: The PASS theory of intelligence. Needham Heights, MA: Allyn and Bacon.
Delaney, E. A., & Hopkins, T. F. (1987). The Stanford-Binet Intelligence Scale: Fourth Edition examiner’s handbook. Itasca, IL: Riverside.
Donders, J., & Warschausky, S. (1997). WISC-III factor index score pattern after traumatic head injury in children. Child Neuropsychology, 3, 71–78.
Elliott, C. D. (1983). The British Ability Scales, Manual 1: Introductory handbook. Windsor, England: NFER-Nelson.
Elliott, C. D. (1990a). Differential Ability Scales. San Antonio, TX: The Psychological Corporation.
Elliott, C. D. (1990b). Differential Ability Scales: Introductory and technical handbook. San Antonio, TX: The Psychological Corporation.
Elliott, S. N. (1990). The nature and structure of the DAS: Questioning the test’s organizing model and use. Journal of Psychoeducational Assessment, 8, 406–411.
Flanagan, D. P., McGrew, K. S., & Ortiz, S. O. (2000). The Wechsler intelligence scales and Gf-Gc theory: A contemporary approach to interpretation. Needham Heights, MA: Allyn and Bacon.
Gallagher, J. J. (1994). Current and historical thinking on education for gifted and talented students. In P. Ross (Ed.), National excellence: An anthology of readings (pp. 83–107). Washington, DC: U.S. Department of Education.
Gresham, F. M., & Witt, J. C. (1997). Utility of intelligence tests for treatment planning, classification, and placement decisions: Recent empirical findings and future directions. School Psychology Quarterly, 12, 249–267.
Grice, J. W., Krohn, E. J., & Logerquist, S. (1999). Cross-validation of the WISC-III factor structure in two samples of children with learning disabilities. Journal of Psychoeducational Assessment, 17, 236–248.
Gross, M. U. M. (1993). Exceptionally gifted children. London: Routledge.
Gustafsson, J.-E. (1984). A unifying model of the structure of intellectual abilities. Intelligence, 8, 179–203.
Gustafsson, J.-E. (1988). Hierarchical models for individual differences in cognitive abilities. In R. J. Sternberg (Ed.), Advances in the psychology of human intelligence (Vol. 4, pp. 35–71). Hillsdale, NJ: Erlbaum.
Gutentag, S. S., Naglieri, J. A., & Yeates, K. O. (1998). Performance of children with traumatic brain injury on the cognitive assessment system. Assessment, 5, 263–272.
Hollingworth, L. S. (1942). Children above 180 IQ Stanford-Binet: Origin and development. Yonkers, NY: World Book.
Kar, B. C., Dash, U. N., Das, J. P., & Carlson, J. S. (1992). Two experiments on the dynamic assessment of planning. Learning and Individual Differences, 5, 13–29.
Kaufman, A. S. (1994). Intelligent testing with the WISC-III. New York: Wiley.
Kaufman, A. S. (2000). Intelligence tests and school psychology: Predicting the future by studying the past. Psychology in the Schools, 37, 7–16.
Kaufman, A. S., & Kaufman, N. L. (1983a). Kaufman Assessment Battery for Children administration and scoring manual. Circle Pines, MN: American Guidance.
Kaufman, A. S., & Kaufman, N. L. (1983b). Kaufman Assessment Battery for Children interpretive manual. Circle Pines, MN: American Guidance.
Kaufman, A. S., & Kaufman, N. L. (1993). Kaufman Adolescent and Adult Intelligence Test. Circle Pines, MN: American Guidance.
Kaufman, A. S., & Lichtenberger, E. O. (1999). Essentials of WAIS-III assessment. New York: Wiley.
Kaufman, A. S., & Lichtenberger, E. O. (2000). Essentials of WISC-III and WPPSI-R assessment. New York: Wiley.
Keith, T. Z. (1990). Confirmatory and hierarchical confirmatory analysis of the Differential Ability Scales. Journal of Psychoeducational Assessment, 8, 391–405.
Keith, T. Z., & Kranzler, J. H. (1999). The absence of structural fidelity precludes construct validity: Rejoinder to Naglieri on what the Cognitive Assessment System does and does not measure. School Psychology Review, 28, 303–321.
Keith, T. Z., Kranzler, J. H., & Flanagan, D. P. (2001). What does the Cognitive Assessment System (CAS) measure? Joint confirmatory factor analysis of the CAS and the Woodcock-Johnson Tests of Cognitive Ability–3rd Edition. School Psychology Review, 30, 89–119.
Keith, T. Z., Quirk, K. J., Schartzer, C., & Elliott, C. D. (1999). Construct bias in the Differential Ability Scales? Confirmatory and hierarchical factor structure across three ethnic groups. Journal of Psychoeducational Assessment, 17, 249–268.
Keith, T. Z., & Witta, E. L. (1997). Hierarchical and cross-age confirmatory factor analysis of the WISC-III: What does it measure? School Psychology Quarterly, 12, 89–107.
Kirby, J. R., & Das, J. P. (1978). Information processing and human abilities. Journal of Educational Psychology, 70, 58–66.
Kirk, S. A., McCarthy, J. J., & Kirk, W. D. (1968). Illinois Test of Psycholinguistic Abilities. Urbana: University of Illinois Press.
Kline, R. B., Snyder, J., & Castellanos, M. (1996). Lessons from the Kaufman Assessment Battery for Children (K-ABC): Toward a new cognitive assessment model. Psychological Assessment, 8, 7–17.
Konold, T. R., Kush, J. C., & Canivez, G. L. (1997). Factor replication of the WISC-III in three independent samples of children receiving special education. Journal of Psychoeducational Assessment, 15, 123–137.
Kranzler, J. H., & Keith, T. Z. (1999). Independent confirmatory factor analysis of the Cognitive Assessment System (CAS): What does the CAS measure? School Psychology Review, 28, 117–144.
Kranzler, J. H., Keith, T. Z., & Flanagan, D. P. (2000). Independent examination of the factor structure of the Cognitive Assessment System (CAS): Further evidence challenging the construct validity of the CAS. Journal of Psychoeducational Assessment, 18, 143–159.
Kuhn, T. (1970). The structure of scientific revolutions (2nd ed.). Chicago: University of Chicago Press.
Kush, J. C., Watkins, M. W., Ward, T. J., Ward, S. B., Canivez, G. L., & Worrell, F. C. (2001). Construct validity of the WISC-III for White and Black students from the WISC-III standardization sample and for Black students referred for psychological evaluation. School Psychology Review, 30, 70–88.
Laurent, J., Swerdlik, M., & Ryburn, M. (1992). Review of validity research on the Stanford-Binet Intelligence Scale: Fourth Edition. Psychological Assessment, 4, 102–112.
Luckasson, R., Coulter, D. L., Polloway, E. A., Reiss, S., Schalock, R. L., Snell, M. E., et al. (1992). Mental retardation: Definition, classification, and systems of supports (9th ed.). Washington, DC: American Association on Mental Retardation.
Luria, A. R. (1963). Restoration of function after brain injury (B. Haigh, Trans.). New York: Macmillan.
Luria, A. R. (1973). The working brain: An introduction to neuropsychology. New York: Basic Books.
Luria, A. R. (1980). Higher cortical functions in man. New York: Basic Books.
Luria, A. R., & Tsvetkova, L. S. (1990). The neuropsychological analysis of problem solving (A. Mikheyev & S. Mikheyev, Trans.). Orlando, FL: Paul M. Deutsch Press.
Maller, S. J., & Ferron, J. (1997). WISC-III factor invariance across deaf and standardization samples. Educational and Psychological Measurement, 57, 987–994.
Marland, S. P. (1972). Education of the gifted and talented: Vol. 1. Report to the Congress of the United States by the U.S. Commissioner of Education. Washington, DC: Government Printing Office.
Maruish, M. E. (Ed.). (1999). The use of psychological testing for treatment planning and outcomes assessment (2nd ed.). Mahwah, NJ: Erlbaum.
Matarazzo, J. D. (1992). Psychological testing and assessment in the 21st century. American Psychologist, 47, 1007–1018.
Mather, N., & Jaffe, L. E. (1992). Woodcock-Johnson Psycho-educational Battery–Revised: Recommendations and reports. Brandon, VT: Clinical Psychology Publishing.
Mather, N., & Woodcock, R. W. (2001). Woodcock-Johnson III Tests of Cognitive Abilities examiner’s manual: Standard and extended batteries. Itasca, IL: Riverside.
Mayer, R. E. (1992). Cognition and instruction: Their historic meeting within educational psychology. Journal of Educational Psychology, 84, 405–412.
McGrew, K. S. (1997). Analysis of the major intelligence batteries according to a proposed comprehensive Gf-Gc framework. In D. P. Flanagan, J. L. Genshaft, & P. L. Harrison (Eds.), Contemporary intellectual assessment: Theories, tests, and issues (pp. 151–180). New York: Guilford Press.
McGrew, K. S., & Flanagan, D. P. (1998). The intelligence test desk reference (ITDR): Gf-Gc cross-battery assessment. Needham Heights, MA: Allyn and Bacon.
McGrew, K. S., Werder, J. K., & Woodcock, R. W. (1991). WJ-R technical manual. Itasca, IL: Riverside.
McGrew, K. S., & Woodcock, R. W. (2001). Technical manual: Woodcock-Johnson III. Itasca, IL: Riverside.
McIntosh, D. E. (1999). Identifying at-risk preschoolers: The discriminant validity of the Differential Ability Scales. Psychology in the Schools, 36, 1–10.
McNemar, Q. (1942). The revision of the Stanford-Binet scale: An
Lubin, B., Wallis, R. R., & Paine, C. (1971). Patterns of psycholog- analysis of the standardization data. Boston: Houghton Mifflin.
ical test usage in the United States: 1935–1969. Professional Meikamp, J. (1999). Review of the Das-Naglieri Cognitive
Psychology, 2, 70–74. Assessment System. In. B. S. Plake & J. C. Impara (Eds.), The
References 441
supplement to the thirteenth mental measurements year- in the Canadian normative sample. Psychological Assessment, 9,
book (pp. 75–77). Lincoln, NE: Buros Institute of Mental 512–515.
Measurements. Ryan, J. J., & Paolo, A. M. (2001). Exploratory factor analysis of the
Millon, T. (1999). Reflections on psychosynergy: A model for inte- WAIS-III in a mixed patient sample. Archives of Clinical
grating science, theory, classification, assessment, and therapy. Neuropsychology, 16, 151–156.
Journal of Personality Assessment, 72, 437–456. Saklofske, D. H., Hildebrand, D. K., & Gorsuch, R. L. (2000).
Millon, T., & Davis, R. D. (1996). Disorders of personality: Replication of the factor structure of the Wechsler Adult Intelli-
DSM-IV and beyond (2nd ed.). New York: Wiley. gence Scale—Third Edition with a Canadian sample. Psycholog-
Naglieri, J. A. (1999). Essentials of CAS assessment. New York: ical Assessment, 12, 436– 439.
Wiley. Sattler, J. M. (1988). Assessment of children (3rd ed.). San Diego,
CA: Author.
Naglieri, J. A., & Das, J. P. (1987). Construct and criterion related
validity of planning, simultaneous, and successive cognitive Snow, R. E. (1998). Abilities at aptitudes and achievements in
processing tasks. Journal of Psychoeducational Assessment, 5, learning situations. In J. J. McArdle & R. W. Woodcock (Eds.),
353–363. Human cognitive abilities in theory and practice (pp. 93–112).
Mahwah, NJ: Erlbaum.
Naglieri, J. A., & Das, J. P. (1988). Planning-Arousal-Simultaneous-
Successive (PASS): A model for assessment. Journal of School Sokal, M. M. (1981). The origins of the Psychological Corporation.
Psychology, 26, 35–48. Journal of the History of the Behavioral Sciences, 17, 54–67.
Solso, R. L., & Hoffman, C. A. (1991). Influence of Soviet scholars.
Naglieri, J. A., & Das, J. P. (1997a). Cognitive Assessment System.
American Psychologist, 46, 251–253.
Itasca, IL: Riverside.
Stanovich, K. E. (1991a). Conceptual and empirical problems with
Naglieri, J. A., & Das, J. P. (1997b). Cognitive Assessment System
discrepancy definitions of reading disability. Learning Disability
interpretive handbook. Itasca, IL: Riverside.
Quarterly, 14, 269–280.
Naglieri, J. A., & Gottling, S. H. (1995). A cognitive education
Stanovich, K. E. (1991b). Discrepancy definitions of reading dis-
approach to math instruction for the learning disabled: An indi-
ability: Has intelligence led us astray? Reading Research
vidual study. Psychological Reports, 76, 1343–1354.
Quarterly, 26, 7–29.
Naglieri, J. A., & Gottling, S. H. (1997). Mathematics instruction
Tallal, P. (2000, March 14). The science of literacy: From the labo-
and PASS cognitive processes: An intervention study. Journal of
ratory to the classroom. Proceedings of the National Academy of
Learning Disabilities, 30, 513–520.
Sciences of the United States of America, 97, 2402–2404.
Naglieri, J. A., & Johnson, D. (2000). Effectiveness of a cognitive Terman, L. M. (1916). The measurement of intelligence. Boston:
strategy intervention in improving arithmetic computation based Houghton Mifflin.
on the PASS theory. Journal of Learning Disabilities, 33, 591–
Terman, L. M. (1925). Genetic studies of genius: Vol. 1. Mental and
597.
physical traits of a thousand gifted children. Stanford, CA:
Naglieri, J. A., & Rojahn, J. (2001). Intellectual classification of Stanford University Press.
Black and White children in special education programs using
Terman, L. M., & Merrill, M. A. (1937). Measuring intelligence: A
the WISC-III and the Cognitive Assessment System. American
guide to the administration of the new revised Stanford-Binet
Journal on Mental Retardation, 106, 359–367.
tests of intelligence. Boston: Houghton Mifflin.
Paolitto, A. W. (1999). Clinical validation of the Cognitive Assess- Terman, L. M., & Merrill, M. A. (1960). Stanford-Binet Intelligence
ment System with children with ADHD. ADHD Report, 7, 1–5. Scale: Manual for the third revision. Form L-M. Boston:
Pressley, M. P., & Woloshyn, V. (1995). Cognitive strategy in- Houghton Mifflin.
struction that really improves children’s academic performance Terman, L. M., & Merrill, M. A. (1973). Stanford-Binet Intelligence
(2nd ed.). Cambridge, MA: Brookline Books. Scale: 1973 norms edition. Boston: Houghton Mifflin.
Prifitera, A., Weiss, L. G., & Saklofske, D. H. (1998). The WISC-III Thorndike, R. L., Hagen, E. P., & Sattler, J. M. (1986). The Stanford-
in context. In A. Prifitera & D. Saklofske (Eds.), WISC-III clini- Binet intelligence scale: Fourth edition. Itasca, IL: Riverside.
cal use and interpretation: Scientist-practitioner perspectives Thorndike, R. M. (1990). Would the real factors of the Stanford-
(pp. 1–38). San Diego, CA: Academic Press. Binet Fourth Edition please come forward? Journal of Psycho-
The Psychological Corporation. (1997). WAIS-III–WMS-III techni- educational Assessment, 8, 412–435.
cal manual. San Antonio, TX: Author. Thorndike, R. M., & Lohman, D. F. (1990). A century of ability
Reinehr, R. C. (1992). Review of the Differential Abilities Scales. In testing. Itasca, IL: Riverside.
J. J. Kramer & J. C. Conoley (Eds.), The eleventh mental mea- Thurstone, L. L. (1938). Primary mental abilities. Chicago: Univer-
surements yearbook (pp. 282–283). Lincoln, NE: Buros Institute sity of Chicago Press.
of Mental Measurements. Tupa, D. J., Wright, M. O., & Fristad, M. A. (1997). Confirmatory
Roid, G. H., & Worrall, W. (1997). Replication of the Wechsler factor analysis of the WISC-III with child psychiatric inpatients.
Intelligence Scale for Children—Third edition four-factor model Psychological Assessment, 9, 302–306.
Undheim, J. O. (1981). On intelligence: II. A neo-Spearman model to replace Cattell’s theory of fluid and crystallized intelligence. Scandinavian Journal of Psychology, 22, 181–187.
Wasserman, J. D., & Becker, K. A. (2000, August). Racial and ethnic group mean score differences on intelligence tests. In J. A. Naglieri (Chair), Making assessment more fair—taking verbal and achievement out of ability tests. Symposium conducted at the annual meeting of the American Psychological Association, Washington, DC.
Wasserman, J. D., Paolitto, A. M., & Becker, K. A. (1999, November). Clinical application of the Das-Naglieri Cognitive Assessment System (CAS) with children diagnosed with Attention-Deficit/Hyperactivity Disorders. Paper presented at the annual meeting of the National Academy of Neuropsychology, San Antonio, TX.
Wechsler, D. (1925). On the specificity of emotional reactions. Journal of Psychology, 36, 424–426.
Wechsler, D. (1939). The measurement of adult intelligence. Baltimore: Williams and Wilkins.
Wechsler, D. (1949). Wechsler Intelligence Scale for Children manual. New York: The Psychological Corporation.
Wechsler, D. (1955). Wechsler Adult Intelligence Scale manual. New York: The Psychological Corporation.
Wechsler, D. (1958). Intelligence et fonction cérébrale. Revue de Psychologie Appliquée, 8, 143–147.
Wechsler, D. (1961). Intelligence, memory, and the aging process. In P. Hoch & J. Zubin (Eds.), Psychopathology of aging (pp. 152–159). New York: Grune and Stratton.
Wechsler, D. (1967). Wechsler Preschool and Primary Scale of Intelligence. New York: The Psychological Corporation.
Wechsler, D. (1974). Cognitive, conative, and non-intellective intelligence. In D. Wechsler (Ed.), Selected papers of David Wechsler (pp. 39–48). New York: Academic Press. (Original work published 1950)
Wechsler, D. (Speaker). (1976, January). Unpublished interview with David Wechsler [Transcript]. San Antonio, TX: The Psychological Corporation.
Wechsler, D. (1991). Wechsler Intelligence Scale for Children—Third Edition manual. San Antonio, TX: The Psychological Corporation.
Wechsler, D. (1997). Wechsler Adult Intelligence Scale—Third Edition: Administration and scoring manual. San Antonio, TX: The Psychological Corporation.
Wechsler, D. (1999). Wechsler Abbreviated Scale of Intelligence. San Antonio, TX: The Psychological Corporation.
Weider, A. (Speaker). (1995, August). An interview with Arthur Weider. Unpublished manuscript. (Available from John D. Wasserman, George Mason University, 4400 University Drive, MSN 2C6, Fairfax, Virginia 22030-4444)
Werner, H. (1948). Comparative psychology of mental development. New York: International Universities Press.
Witt, J. C., & Gresham, F. M. (1985). Review of the Wechsler Intelligence Scale for Children—Revised. In J. V. Mitchell (Ed.), Ninth mental measurements yearbook (pp. 1716–1719). Lincoln: University of Nebraska Press.
Wolf, T. H. (1973). Alfred Binet. Chicago: University of Chicago Press.
Woodcock, R. W. (1998). The WJ-R and Batería-R in neuropsychological assessment (Research Rep. No. 1). Itasca, IL: Riverside.
Woodcock, R. W., McGrew, K. S., & Mather, N. (2001a). Woodcock-Johnson III Tests of Cognitive Abilities. Itasca, IL: Riverside.
Woodcock, R. W., McGrew, K. S., & Mather, N. (2001b). Woodcock-Johnson III Tests of Achievement. Itasca, IL: Riverside.
World Health Organization. (1992). The ICD-10 classification of mental and behavioral disorders: Clinical descriptions and diagnostic guidelines. Geneva, Switzerland: Author.