PSYC 326 Lecture 3 NEW (Compatibility Mode)

This document discusses different methods of interpreting test scores, including raw scores, transformed scores, and various types of norms. It explains that raw scores are simply the number of points received on a test, while transformed scores put raw scores on a scale with defined characteristics to allow for normative interpretation. The document outlines criterion-referenced interpretation, which describes specific tasks based on performance standards, and norm-referenced interpretation, which indicates an individual's relative position compared to a reference group. It also discusses different types of norms like age norms, grade norms, and percentile norms that are used to interpret test scores.

Uploaded by

FELIX ADDO

TEST SCORE INTERPRETATION

Paul Doku, PhD


Aims
• By the end of this class, students will be able
to
– Explain the differences between raw and
transformed scores.
– Differentiate between norm-referenced and
criterion-referenced test score interpretation.
– Explain various methods of transforming test
scores.
Raw scores
• A raw score is a number that summarizes or
captures some aspects of a person’s
performance in the carefully selected and
observed behaviour samples that make up
psychological tests.
• A raw score is simply the number of points
(marks) a person receives on a test when the
test is scored according to the directions.
• Making sense out of test results, that is,
making test results meaningful is largely a
matter of transforming or converting the raw
scores into more interpretable and useful
forms of information.
Psychological Test Interpretation
• Test interpretation is the process of assigning meaning
and usefulness to the scores obtained from tests or
assessments.
• For instance, a score of 50% in one mathematics
test cannot be said to be better than a score of 40%
obtained by the same examinee in another
mathematics test.
• Raw scores from different tests are not based on the
same standard of measurement, so meaning cannot be
read into them directly as a basis for academic and
psychological decisions.
Frames of reference in test score
interpretation.
• In general, we can give meaning to a raw score in two
main ways.
• First, by converting it into a description of the specific tasks
that the person can perform that is, criterion-referenced
interpretation.
• Second, by converting it into some type of derived score
that indicates the person’s relative position in a clearly
defined reference group, that is, norm-referenced
interpretation.
• In some cases, both types of interpretation may be
appropriate and useful.
Criterion-referenced test interpretation
• Criterion-referenced interpretation is the
interpretation of a raw test score based on the
conversion of the raw score into a description of
the specific tasks that the learner can perform.
• A score is given meaning by comparing it with the
standard of performance that is set before the test
is given.
• It permits the description of a learner’s test
performance without referring to the performance
of others.
Examples of criterion-referenced
interpretation
• Types 60 words per minute without error (speed)
• Driving test
• Answers 70% of exam questions correctly
(percentage-correct score)
• For a criterion-referenced test to be meaningful,
the test has to be specifically designed to measure
a set of clearly stated learning tasks.
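A percentage-correct criterion such as the 70% example above can be checked with a few lines of Python. This is only a sketch: the function names, the default 70% cutoff, and the sample numbers are illustrative assumptions, and in practice the standard would be set before the test is given.

```python
def percentage_correct(num_correct, num_items):
    """Return the percentage of items answered correctly."""
    if num_items <= 0:
        raise ValueError("num_items must be positive")
    return 100.0 * num_correct / num_items


def meets_criterion(num_correct, num_items, cutoff_percent=70.0):
    """True if the examinee reaches the preset standard (hypothetical cutoff)."""
    return percentage_correct(num_correct, num_items) >= cutoff_percent


# Hypothetical examinee: 36 of 50 items correct.
print(percentage_correct(36, 50))  # 72.0
print(meets_criterion(36, 50))     # True
```

Note that the decision uses only the examinee's own score and the preset standard; no other examinee's score is involved.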
Criterion-referenced test interpretation
• When the relationship between the items or
tasks of a test and standards of performance is
demonstrable and well defined, test scores may
be evaluated via criterion-referenced
interpretation.
• In criterion-referenced tests, we set standards
for judging whether a person has mastered
achievement domains or performance. The
standard of performance or behaviour is known
as the criterion.
Norm-Referenced Interpretation
• Norm-referenced interpretation is the
interpretation of a raw score based on the conversion
of the raw score into some type of derived score
that indicates the learner’s relative position in a
clearly defined reference group.
• This type of interpretation reveals how a learner
compares with other learners who have taken the
same test.
Norm-Referenced Interpretation
• First, examinees’ raw scores are ranked from
highest to lowest scores
• Secondly, the position of an individual’s score is
compared to that of other examinees in the test
• In Norm-referenced interpretation, what is
important is a sufficient spread of test scores to
provide reliable ranking
• The percentage score, or how easy or difficult
the test is, is not necessarily important when
interpreting test scores in terms of relative
performance.
Norm-referenced test interpretation
• It uses standards based on the performance of
specific groups of people to provide
information for interpreting scores.
• This type of test interpretation is useful when
we need to compare individuals with one
another or with a reference group in order to
evaluate differences between them on
whatever characteristics the test measures.
• Simple ranking of students’ scores in a class
test is an example of norm-referenced test
interpretation.
• Although the simplest ranking of raw scores may
be useful for reporting the results of a classroom
test, it is of limited value beyond the immediate
situation because the meaning of a given rank
depends on the number of group members.
• To obtain a more general framework, that is, rules
and ideas that guide us in making meaningful
and effective norm-referenced interpretations,
we convert or transform raw scores into some
type of derived score.
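The point that a rank means little without the group size can be illustrated with a short Python sketch. The function name and both class lists are hypothetical, made up for this illustration only.

```python
def rank_from_top(score, group_scores):
    """Rank of `score` in the group, where 1 is the highest score.

    Tied scores share the best (smallest) rank.
    """
    return 1 + sum(1 for s in group_scores if s > score)


# Two hypothetical classes in which a raw score of 70 is ranked 3rd.
small_class = [80, 75, 70, 65, 60]
large_class = [80, 75, 70] + [60] * 47

for group in (small_class, large_class):
    print(rank_from_top(70, group), "out of", len(group))
# 3 out of 5
# 3 out of 50
```

Third out of 5 and third out of 50 are very different standings, which is why raw ranks are converted into derived scores.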
Derived or transformed scores
• A derived score or a transformed score is a
numerical report of test performance on a
scale that has well-defined characteristics and
yields normative meaning.
Norms
• Norms are the frames of reference on which interpretations of test
scores are based.
• They represent the typical performance of the examinees in
the reference group on which the test was
standardized.
• They are obtained by administering the test to a
representative sample of the examinees for whom the test
was designed.
• Comparing test scores with these reference
frames makes it possible to predict a learner’s probable
success in various areas.
• E.g. the diagnosis of strengths and weaknesses, measuring
educational growth, and the use of the test results for other
instructional guidance purposes.
• Norms refer to the test performance or typical
behaviour of one or more reference groups.
• Norms are usually presented in the form of
tables with descriptive statistics (means,
standard deviations, etc.) that summarize the
performance of the group or groups in
question.
• Gathering norms is a central aspect of the
process of standardizing a norm-referenced
test.
• The standardization sample is the group of
individuals on whom a test is originally
standardized in terms of administration and
scoring procedures, as well as in developing
the test norms.
• The normative sample is synonymous with the
standardization sample, but can refer to any
group from which norms are gathered.
• Test norms enable us to compare a person’s
performance with that of other people. In
general, norms indicate an examinee’s
standing on a test relative to the performance
of other persons of the same age, grade, sex,
and so on. Test norms enable us to interpret
and use test results effectively.
• The most common types of test norms are:
grade norms; age norms; percentile norms /
percentile ranks / percentile scores; and
standard scores.
Age norms
• An age norm depicts the level of performance for each
separate age group in the normative sample.
• Age norms are based on the average scores earned by
pupils at various age levels and are interpreted in
terms of age level or age equivalents.
• A test may be prepared with a certain age or range of
ages of examinees in mind.
• Age norms are mostly used in elementary schools for
mental ability tests, personality tests, reading
tests and interest inventories, where growth patterns tend
to be consistent.
Age Norms
• The performance typical of specific ages on these tests is
determined, and the performance of an individual in that
age group is compared against it.
• The typical performance is determined by giving the test to
a very large sample of the population for whom the test is
meant and then, after scoring, finding the mean score of the
sample.
• This average score then becomes the typical performance
for the age.
• The performance of a student of this age on the test can
then be interpreted as higher than, lower than, or the same
as the average performance for the age.
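The procedure above (score a sample at each age, take the mean, then compare an individual against it) can be sketched in Python. The function names and the tiny data set are hypothetical; a real normative sample would be far larger.

```python
from statistics import mean


def build_age_norms(scores_by_age):
    """Map each age to the mean score of its normative sample."""
    return {age: mean(scores) for age, scores in scores_by_age.items()}


def compare_to_age_norm(score, age, norms):
    """Describe a score as above, below, or equal to the norm for that age."""
    typical = norms[age]
    if score > typical:
        return "above"
    if score < typical:
        return "below"
    return "equal"


# Hypothetical normative data: raw scores grouped by age in years.
sample = {7: [10, 12, 14], 8: [15, 17, 19], 9: [20, 22, 24]}
norms = build_age_norms(sample)

print(norms[8])                            # 17
print(compare_to_age_norm(19, 8, norms))   # above
```

The same structure would apply to grade norms, with school grade in place of age as the grouping key.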
Age norms
• The purpose of age norms is to facilitate
same-aged comparisons. With age norms, the
performance of an examinee is interpreted in
relation to standardisation subjects of the
same age.
Grade norm
• A grade norm depicts the level of test
performance for each separate grade in the
normative sample. Grade in this context may
refer to stages or classes such as primary one,
primary two, JSS 1, etc.
• Grade norms are rarely used with ability tests.
However, these norms are especially useful in
school settings when reporting the achievement
levels of school children, particularly at the
basic/elementary school level.
Grade norm
• They are based on the average scores earned by
pupils in each of a series of grades and are
interpreted in terms of grade equivalent or
average.
• Since academic achievement in many content
areas is heavily dependent on grade-based
curricular exposure, comparing a student against
a normative sample from the same grade is more
appropriate than using age-based comparison
• Grade norms are prepared for traits that show a progressive
and relatively uniform increase from one school grade (class)
to the next higher grade
• The norm for any grade (class) is then the average score
obtained by individuals in that grade
• The process of establishing grade norms involves giving the
test to a representative sample of pupils in each of a number
of consecutive grades to evaluate the (mean) performance
of individuals in their respective specific grades (classes) of
the school system
• This is achieved by limiting the content and objectives of the
test to the content and objectives relevant to the class.
• The mean score obtained becomes the typical score of the
grade (class), against which the performance of the members
of the grade can be compared and interpreted.
Local norms
• Local norms are derived from representative
local examinees, as opposed to a national
sample. For example, the typical performance
of rural or ‘less endowed schools’ (local
norms) might be lower than the typical
performance of all schools in general (national
norms) in our country.
Subgroup norms
• Subgroup norms consist of the scores obtained from
an identifiable group (whites, Africans, African-
Americans, females, etc.), as opposed to a diversified
national sample.
• For example, in university admissions, females and
students from ‘less endowed’ or rural schools are given
special concessions on the basis that their typical
performance (norm) is different from (generally lower
than) the national norm. The practice of offering
university admission concessions to ‘less privileged
groups’ is thus based on the idea of norms.
Percentile score/ranks
• A percentile score indicates the relative
position of an individual test taker compared
to a reference group, such as the
standardization sample.
• Specifically, it represents the percentage of
persons in the reference group who scored at
or below a given raw score.
Percentile score
• A percentile rank or percentile score indicates a
person’s relative position in a group in terms of
the percentage of people scoring lower.
• For example, if a person’s raw score of 29 equals a
percentile rank of 70, it means 70 percent of the
people who took the test obtained a raw score
lower than 29.
• Stating it another way, the person’s performance
is better than 70 percent of the group that took
the test. This is denoted as P70.
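As a rough illustration, a percentile rank can be computed by counting scores in the reference group. The function name, the `inclusive` flag, and the ten group scores are assumptions made for this sketch. The flag switches between the two wordings used in these slides: "scoring lower" (the default) and "scored at or below".

```python
def percentile_rank(raw_score, group_scores, inclusive=False):
    """Percentage of the group scoring below raw_score.

    With inclusive=True, scores equal to raw_score are counted as well
    (the "at or below" definition).
    """
    if not group_scores:
        raise ValueError("group_scores must not be empty")
    if inclusive:
        count = sum(1 for s in group_scores if s <= raw_score)
    else:
        count = sum(1 for s in group_scores if s < raw_score)
    return 100.0 * count / len(group_scores)


# Hypothetical reference group of ten examinees.
group = [12, 15, 18, 20, 22, 25, 27, 29, 31, 35]

print(percentile_rank(29, group))                  # 70.0
print(percentile_rank(29, group, inclusive=True))  # 80.0
```

In this made-up group, seven of the ten scores are lower than 29, so a raw score of 29 corresponds to P70 under the "scoring lower" definition. The two definitions diverge most in small groups or when many scores are tied.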
Percentile score
• Percentiles are different from percentage
scores.
• Percentage scores reflect the number of
correct responses that an individual obtains
out of the total possible number of correct
responses on a test.
• The frame of reference with percentage scores
is the content of the entire test whereas with
percentile scores, it is other people.
Percentile score
• Higher percentiles indicate higher scores.
• In the extreme case, an examinee who
obtained a raw score that exceeded every
score in the group would receive a percentile
rank of 100, or P100.
• Let us not confuse percentiles with percent
correct.
Percentile score
• We can also view percentiles as ranks in a
group of 100 representative subjects, with 1
being the lowest rank and 100 the highest.
• Note that percentile ranks are the complete
reverse of usual ranking procedures.
• A percentile rank (PR) of 1 is at the bottom of
the sample, while a PR of 99 is near the top.
Percentiles
• A percentile of 50 (P50) corresponds to the
median or middlemost raw score.
• A percentile of 25 (P25) is often denoted as Q1
or the first quartile because one-quarter of
the scores fall below this point.
• In like manner, a percentile of 75 (P75) is
referred to as Q3 or the third quartile because
three-quarters of the scores fall below this
point.
• Advantages and disadvantages of percentiles?
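The quartile points described above (P25, P50, P75) can be found with Python's standard `statistics` module. The scores are hypothetical, and the default method of `statistics.quantiles` is only one of several conventions; other methods can give slightly different cut points on small samples.

```python
from statistics import median, quantiles

# Hypothetical raw scores from eleven examinees.
scores = [41, 45, 48, 50, 52, 55, 57, 60, 63, 66, 70]

# n=4 splits the distribution into four equal parts and returns
# the three cut points Q1 (P25), Q2 (P50) and Q3 (P75).
q1, q2, q3 = quantiles(scores, n=4)

print(q1, q2, q3)             # 48.0 55.0 63.0
print(q2 == median(scores))   # True: P50 is the median
```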
Standard scores
• Another method of indicating a person’s relative
position is by showing how far the raw score is above
or below the average.
• This is the approach used with standard scores.
• Basically, standard scores express test performance in
terms of standard deviation units from the mean.
• The mean (M) is the arithmetical average, which is
determined by adding all of the scores and dividing by
the number of scores.
• The standard deviation (SD) is a measure of the spread
of scores around the mean.
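A small Python sketch of the two statistics follows. The scores are hypothetical, and the use of the population formula (`pstdev`) is an assumption: the slides do not say whether the population or the sample standard deviation is intended.

```python
from statistics import mean, pstdev


def describe(scores):
    """Return (mean, standard deviation) for a list of raw scores.

    Uses the population SD; statistics.stdev would give the sample SD.
    """
    return mean(scores), pstdev(scores)


# Hypothetical raw scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
m, sd = describe(scores)

print(m)   # 5
print(sd)  # 2.0
```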
Standard scores
• A standard score uses the standard deviation of
the total distribution of raw scores as the
fundamental unit of measurement.
• Also, the standard score expresses the
distance from the mean in standard deviation
units.
• For example, a raw score that is exactly one
standard deviation above the mean has a
standard score of +1.00.
Standard scores
• A raw score that is exactly half a standard
deviation below the mean converts to a
standard score of -0.50.
• Thus, a standard score expresses not only the
magnitude of the departure from the mean but
also its direction (positive or negative).
Normal curve
• The Normal Curve / Normal Distribution Curve
and Standard Deviation Units
• The normal probability curve, also called the
normal distribution curve or simply the normal
curve, is a symmetrical bell-shaped curve that has
many mathematical properties.
• 1. The curve is bell-shaped and bilaterally
symmetrical with the highest point at the centre.
• 2. The mean, median and mode all fall at the
centre of the curve.
Normal curve
• 3. The standard deviation is the distance from the
mean to the point of inflection (all SDs are equal
in distance along the baseline of the curve).
• 4. The percent for each area under the curve
represents percentage of cases.
• 5. Each of the scores below the curve can be
translated into any of the others such as Z-scores,
Percentile scores, T-scores, etc. (if we can assume
essentially normally distributed scores and
comparable norms).
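Property 5 can be illustrated by converting between z-scores and percentiles with the standard normal curve. This sketch assumes the scores really are approximately normally distributed, as the slide cautions; the function names are illustrative, and `NormalDist` comes from Python's standard library.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def z_to_percentile(z):
    """Percentage of a normal distribution falling at or below z."""
    return 100.0 * STANDARD_NORMAL.cdf(z)


def percentile_to_z(percentile):
    """z-score below which the given percentage of cases falls (0 < p < 100)."""
    return STANDARD_NORMAL.inv_cdf(percentile / 100.0)


print(round(z_to_percentile(0.0), 1))   # 50.0 (the mean is also the median)
print(round(z_to_percentile(1.0), 1))   # 84.1
print(round(z_to_percentile(-0.5), 1))  # 30.9
```

If the score distribution is clearly skewed, these conversions no longer hold and percentile ranks should be taken from the observed distribution instead.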
        Test A    Test B
M         56        72
SD         4         6
Z score
• The z-score expresses test performance simply
and directly as the number of standard
deviation units a raw score is above or below
the mean.
Z score
• Computation of an examinee’s z-score (usually
referred to as standard score) is simple:
subtract the mean of the normative group
from the examinee’s raw score and then
divide this difference by the standard
deviation of the normative group.
Z score
z-score = (X – M) / SD

Where
• X = any raw score
• M = arithmetic mean of raw scores
• SD = standard deviation of raw scores
Z score
Person A: raw score of 58 (above average)
• z = (58 – 56) / 4 = 0.50
Person B: raw score of 50 (below average)
• z = (50 – 56) / 4 = -1.50
Person C: raw score of 56 (exactly average)
• z = (56 – 56) / 4 = 0.00
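The three calculations above can be reproduced in Python. They appear to use the Test A values from the earlier table (M = 56, SD = 4). The function name and the dictionaries are illustrative, and the cross-test comparison at the end uses hypothetical raw scores that are not from the slides.

```python
def z_score(raw, mean, sd):
    """Number of standard deviation units a raw score lies from the mean."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (raw - mean) / sd


# Means and standard deviations from the Test A / Test B table.
TEST_A = {"mean": 56, "sd": 4}
TEST_B = {"mean": 72, "sd": 6}

print(z_score(58, **TEST_A))  # 0.5   (Person A)
print(z_score(50, **TEST_A))  # -1.5  (Person B)
print(z_score(56, **TEST_A))  # 0.0   (Person C)

# Hypothetical comparison across tests: 60 on Test A versus 75 on Test B.
print(z_score(60, **TEST_A))  # 1.0
print(z_score(75, **TEST_B))  # 0.5
```

Although 75 is the larger raw score, the 60 on Test A is the stronger performance relative to its own group, which is the kind of comparison raw scores alone cannot support.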
T score
• The term T-Score refers to any set of normally
distributed standard scores that has a mean of
50 and a standard deviation of 10.
• T-score = 10 (X – M) / SD + 50
• T = 10z + 50
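A short Python sketch of the T-score formula, applied to the same three raw scores used in the z-score examples (M = 56, SD = 4). The function name is illustrative.

```python
def t_score(raw, mean, sd):
    """Convert a raw score to a T-score (mean 50, standard deviation 10)."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    z = (raw - mean) / sd
    return 10 * z + 50


print(t_score(58, 56, 4))  # 55.0  (z = 0.50)
print(t_score(50, 56, 4))  # 35.0  (z = -1.50)
print(t_score(56, 56, 4))  # 50.0  (z = 0.00)
```

One practical effect of the T-score scale is that below-average scores are no longer negative numbers.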
