Flanagan, D., & Caltabiano, L. (2004). Test Scores: A Guide to Understanding and Using Test Results.

This document provides definitions and explanations of common terms used to describe student performance on standardized tests. It discusses standard scores, percentiles, confidence intervals, subtest scores, and composite scores. It emphasizes that standardized test scores should be interpreted within the context of a student's performance across multiple subtests and compared to either age or grade norms, depending on the type of test. Overall scores may not accurately reflect a student's abilities if their performance is uneven across different subtests.

TEST SCORES: A GUIDE TO UNDERSTANDING AND USING TEST RESULTS

Article · January 2004

TEST SCORES: A GUIDE TO UNDERSTANDING
AND USING TEST RESULTS
By Dawn P. Flanagan, PhD, & Lenny F. Caltabiano
St. John’s University

When a student takes either an individually or group-administered standardized test at school, the
results are made available to both parents and teachers. It is important that parents and teachers
understand the meaning of scores that come from standardized tests. This handout provides a
description of common terms used to describe test performance. You are also encouraged to refer to
handouts on Psychological Reports (Flanagan & Caltabiano) and Intellectual Assessment (Ortiz & Lella)
to gain a better understanding of the evaluation process (See “Resources”).

Frequently Used Terms

The results of most psychological tests are reported using either standard scores or percentiles.
Standard scores and percentiles describe how a student performed on a test compared to a
representative sample of students of the same age from the general population. This comparison sample
or group is called a norm group. Because educational and psychological tests do not measure abilities
and traits perfectly, standard scores are usually reported with a corresponding confidence interval to
account for error in measurement.

Standard Score
Most educational and psychological tests provide standard scores that are based on a scale that has
a statistical mean (or average score) of 100. If a student earns a standard score that is less than 100,
then that student is said to have performed below the mean, and if a student earns a standard score that
is greater than 100, then that student is said to have performed above the mean. However, there is a
wide range of average scores, from low average to high average, with most students earning standard
scores on educational and psychological tests that fall in the range of 85–115. This is the range in which
68% of the general population performs and, therefore, is considered the normal limits of functioning.
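The 68% figure can be checked with a short Python sketch. It assumes standard scores follow a normal curve with a mean of 100 and a standard deviation of 15; the standard deviation is an assumption inferred from the 85–115 range, not a value stated in this handout, and the function names are hypothetical. Published norm tables may differ from these approximations by a point or so.

```python
from statistics import NormalDist

# Assumption: standard scores are normally distributed with mean 100 and
# standard deviation 15 (the SD is inferred from the 85-115 range).
STANDARD_SCORE_SCALE = NormalDist(mu=100, sigma=15)


def percentile_rank(standard_score: float) -> int:
    """Approximate percentage of the norm group scoring below this score."""
    return round(STANDARD_SCORE_SCALE.cdf(standard_score) * 100)


def share_between(low: float, high: float) -> float:
    """Approximate proportion of the norm group scoring between low and high."""
    return STANDARD_SCORE_SCALE.cdf(high) - STANDARD_SCORE_SCALE.cdf(low)


if __name__ == "__main__":
    print(percentile_rank(100))               # the mean sits at the 50th percentile
    print(round(share_between(85, 115), 2))   # about 0.68 of the population
```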
Classifying standard scores. The normal limits of functioning encompass three classification
categories: low average (standard scores of 85–89), average (standard scores of 90–110), and high
average (standard scores of 111–115). These classifications are typically used by school psychologists
and other assessment specialists to describe a student’s ability compared to same-age peers from the
general population.
Subtest scores. Many psychological tests are composed of multiple subtests that have a mean of 10,
50, or 100. Subtests are relatively short tests that measure specific abilities, such as vocabulary, general
knowledge, or short-term auditory memory. Two or more subtest scores that reflect different aspects of
the same broad ability (such as broad Verbal Ability) are usually combined into a composite or index
score that has a mean of 100. For example, a Vocabulary subtest score, a Comprehension subtest score,
and a General Information subtest score (the three subtest scores that reflect different aspects of Verbal
Ability) may be combined to form a broad Verbal Comprehension Index score. Composite scores, such as
IQ scores, Index scores, and Cluster scores, are more reliable and valid than individual subtest scores.
Therefore, when a student’s performance demonstrates relatively uniform ability across subtests that
measure different aspects of the same broad ability (the Vocabulary, Comprehension, and General
Information subtest scores are all average), then the most reliable and valid score is the composite
score (Verbal Comprehension Index in this example). However, when a student’s performance
demonstrates uneven ability across subtests that measure different aspects of the same broad ability
(the Vocabulary score is below average, the Comprehension score is below average, and the General
Information score is high average), then the Verbal Comprehension Index may not provide an accurate
estimate of verbal ability. In this situation, the student’s verbal ability may be best understood by
looking at what each subtest measures. In sum, it is important to remember that unless performance is
relatively uniform on the subtests that make up a particular broad ability domain (such as Verbal
Ability), then the overall score (in this case the Verbal Comprehension Index) may be a misleading
estimate.

Percentile
Standard scores may also be reported with a percentile to aid in understanding performance. A
percentile indicates the percentage of individuals in the norm group that scored below a particular
score. For example, a student who earned a standard score of 100 performed at the 50th percentile. This
means that the student performed as well as or better than 50% of same-age peers from the general
population. A standard score of 90 has a percentile rank of 25. A student who is reported to be at the
25th percentile performed as well as or better than 25% of same-age peers, just as a student who is
reported to be at the 75th percentile performed as well as or better than 75% of students of the same
age. While the standard score of 90 is below the statistical mean of 100 and is at the 25th percentile,
this performance is still within the average range and generally does not indicate any need for concern.

Confidence Interval
Psychological tests do not measure ability perfectly. No matter how carefully a test is developed, it
will always contain some form of error or unreliability. This error may exist for various reasons that
are not always readily identifiable. In order to account for this error, standard scores are often
reported with confidence intervals. Confidence intervals represent a range of standard scores in which
the student’s true score is likely to fall a certain percentage of the time. Most confidence intervals
are set at 95%, meaning that a student’s true score is likely to fall between the upper and lower limits
of the confidence interval 95 out of 100 times (or 95% of the time). For example, if a student earned a
standard score of 90 with a confidence interval of ±5, this means that the lower limit of the confidence
interval is 85 (that is, 90 – 5 = 85) and the upper limit of the confidence interval is 95 (90 + 5 = 95).
The standard score of 90 may be reported in a psychological report as 90 ± 5 or 90 (85–95). Although the
student’s score on the day of the evaluation was 90 in this example, the true score may be lower or
higher than 90 owing to error associated with the method in which the ability was measured. Therefore,
it is more accurate to say that there is a 95% chance that the student’s true performance on this test
falls somewhere between 85 and 95. Tests that are highly reliable have relatively small confidence bands
associated with their scores, indicating that these tests provide the most consistent scores across time.

Example: Reporting Scores
The following statement is one that can be commonly found in a psychological report and can be used to
illustrate these definitions: “Jacob obtained a standard score of 93 ± 7 on a test of reading
comprehension, which is ranked at the 33rd percentile and is classified as average.” This is what that
statement means: First, Jacob’s observed score fell below the mean of 100. Second, Jacob did as well as
or better than 33% of students his age from the general population. Third, there is a 95% chance that
Jacob’s true score falls somewhere between 86 and 100. Fourth, Jacob’s performance is considered average
relative to same-age peers from the general population. The table at the end of this handout provides
commonly used performance classifications for standard scores and percentiles.

Understanding the Assessment Report
Type of norms used. It is important to take note of the types of norms used when reading test results in
a psychological or school assessment report. A student’s performance on a standardized test can be
compared to other students of the same age (age norms) or of the same grade (grade norms). Age norms are
always used for tests of intellectual ability so that comparisons can be made to same-age peers. The use
of grade norms is related to the type of test being utilized or may be dictated by certain situations.
For example, grade norms may be most appropriate for achievement tests when a student has repeated a
grade and to see how the student’s performance compares to grade-level peers.
Use of age or grade equivalents. Age and grade equivalents are different from age and grade norms.
Essentially, the age and grade equivalents are scores that indicate the typical age or grade level of
students who obtain a given score. For example, if Jacob’s performance on the test of reading
comprehension is equal to an age equivalent of 8.7 years and a grade equivalent of 2.6, this means that
his obtained raw score is equivalent to the same number of items correct that is average for all 8-year,
7-month-old children included in the norm group on that particular reading comprehension test.
Additionally, Jacob’s score is equivalent to the average reading comprehension performance of all
children included in the normative sample who were in the sixth month of second grade. The age or grade
equivalents do not mean that Jacob is

functioning on an 8-year-old, mid-second-grade level. Remember that Jacob’s standard score of 93 is
classified as average and falls within the normal range of functioning. Consequently, it is always
important to make decisions and interpretations about normal functioning using standard scores, not age
and grade equivalents.
Validity of scores. Reports of assessment results typically include a statement as to the validity, or
accuracy, of the test scores. There are many factors that can influence a student’s test performance.
These factors may include, but are not limited to, behavior during testing, the presence of distractions
during testing, the student’s cultural and linguistic background, and the student’s physical health at
the time of testing. An educational or psychological test report should indicate whether any of these
factors were present and how they may have affected the results of the test, thereby compromising the
validity of the findings. Typically, this information, appearing in the Behavioral Observations section
of a psychological report, aids in assessing the validity and usefulness of the test findings. If the
school psychologist did not observe any unusual behaviors during testing and if no other factors,
internal (lack of motivation, depressed mood, fatigue) or external (loud voices outside the testing
room), were believed to have had an adverse effect on test performance, then the psychologist’s statement
about the validity of the findings may be like this: “Overall, the current test results appear to
represent a valid estimate of Jacob’s cognitive and academic functioning.” This statement assists the
reader in determining whether the results from the psychological tests administered to the student may
be used confidently to make diagnostic and educational decisions.

Summary
When parents and teachers better understand the meaning of scores from educational or psychological
evaluations, they are able to better plan to meet student needs. Additional information is available in
the “Resources” below, and from the assessment professionals at your school, such as the school
psychologist or counselor.

Resources
Flanagan, D., & Caltabiano, L. (2004). Psychological reports: A guide for parents and teachers. In A.
Canter, L. Paige, M. Roth, I. Romero, & S. Carroll (Eds.), Helping children at home and school II:
Handouts for families and educators. Bethesda, MD: National Association of School Psychologists.
Harcourt Assessment (n.d.). Some things parents should know about testing. Available:
http://marketplace.psychcorp.com (See Resource Center, About Testing)
Ortiz, S., & Lella, S. (2004). Intellectual assessment and cognitive abilities: Basics for parents and
educators. In A. Canter, L. Paige, M. Roth, I. Romero, & S. Carroll (Eds.), Helping children at home and
school II: Handouts for families and educators. Bethesda, MD: National Association of School
Psychologists.
Wright, P. D., & Wright, P. D. (2000). Understanding tests and measurement for the parent and advocate.
Available: www.ldonline.org/ld_indepth/assessment/tests_measurements.html

References for the Table
Flanagan, D., & Ortiz, S. (2001). Essentials of cross-battery assessment. New York: Wiley.
Flanagan, D., Ortiz, S., Alfonso, V., & Mascolo, J. (2002). The achievement test desk reference:
Comprehensive assessment and learning disabilities. Boston: Pearson, Allyn & Bacon.
Woodcock, R. W., & Mather, N. (1989). WJ-R Test of Cognitive Ability, Standard and Supplemental
Batteries: Examiner’s manual. In R. W. Woodcock & M. B. Johnson (Eds.), Woodcock-Johnson
Psycho-Educational Battery-Revised. Chicago: Riverside.

Websites
Harcourt Assessment: http://marketplace.psychcorp.com

Dawn P. Flanagan, PhD, is a Professor and Coordinator of the School Psychology program at St. John’s
University, Jamaica, NY. Lenny F. Caltabiano is a doctoral student in school psychology at St. John’s
University.

© 2004 National Association of School Psychologists, 4340 East West Highway, Suite 402, Bethesda, MD
20814, (301) 657-0270.

Helping Children at Home and School II: Handouts for Families and Educators S2–83
Classifying Test Scores

Result                                         Classification of Performance
Standard score range   Percentile rank range   Descriptive       Normative
≥ 131                  98–99+                  Very superior     Normative strength; 16% of the population
121–130                92–97                   Superior          Normative strength; 16% of the population
116–120                85–91                   Above average     Normative strength; 16% of the population
111–115                76–84                   High average      Normal limits; 68% of the population
90–110                 25–75                   Average           Normal limits; 68% of the population
85–89                  16–24                   Low average       Normal limits; 68% of the population
80–84                  9–15                    Below average     Normative weakness; 16% of the population
70–79                  3–8                     Deficient         Normative weakness; 16% of the population
≤ 69                   ≤ 2                     Very deficient    Normative weakness; 16% of the population

Note. Classifications are based on those described in Flanagan and Ortiz (2001) and Flanagan, Ortiz,
Alfonso, and Mascolo (2002) and were adapted from Woodcock and Mather (1989).
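
As an illustration, the classification table and the confidence interval arithmetic described earlier can be combined in a minimal Python sketch. It assumes the cutoffs from the table, with scores of 131 and above treated as very superior and 69 and below as very deficient. The names BANDS, Interpretation, and interpret are hypothetical and do not come from any test publisher's software.

```python
from typing import NamedTuple

# (lowest standard score in the band, descriptive label, normative label),
# listed from the highest band down. Cutoffs are transcribed from the table;
# treating 131 as very superior and 69 as very deficient is an assumption.
BANDS = [
    (131, "Very superior", "Normative strength"),
    (121, "Superior", "Normative strength"),
    (116, "Above average", "Normative strength"),
    (111, "High average", "Normal limits"),
    (90, "Average", "Normal limits"),
    (85, "Low average", "Normal limits"),
    (80, "Below average", "Normative weakness"),
    (70, "Deficient", "Normative weakness"),
]


class Interpretation(NamedTuple):
    descriptive: str
    normative: str
    lower: int  # lower limit of the confidence interval
    upper: int  # upper limit of the confidence interval


def interpret(score: int, margin: int) -> Interpretation:
    """Classify a standard score and compute its band as score +/- margin."""
    for floor, descriptive, normative in BANDS:
        if score >= floor:
            break
    else:
        # No band matched, so the score is below 70.
        descriptive, normative = "Very deficient", "Normative weakness"
    return Interpretation(descriptive, normative, score - margin, score + margin)


if __name__ == "__main__":
    # Jacob's reported score of 93 with a margin of 7 from the example above.
    print(interpret(93, 7))
```

For Jacob's score, this returns the "Average" classification within normal limits and a confidence interval of 86 to 100, matching the report statement in the example.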
