Week 5 of Tests and Testing
Objectives:
After the completion of the chapter, students should be able to:
1. Discuss the myths and realities in psychological assessment
2. Understand the psychometric properties of a good test
3. Understand the nature of norms
Chapter Topics:
A. Some assumptions about psychological testing and assessment
B. What’s a good test?
C. Norms
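A. Some Assumptions about Psychological Testing and Assessment
Assumption 1: Psychological Traits and States Exist
Psychological Trait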
The term psychological trait, much like the term trait alone,
covers a wide range of possible characteristics.
Among them are psychological traits that relate to
intelligence, specific intellectual abilities, cognitive style,
adjustment, interests, attitudes, sexual orientation and
preferences, psychopathology, personality in general, and
specific personality traits.
Construct A psychological trait exists only as a construct:
an informed, scientific concept developed or
constructed to describe or explain behavior.
Overt Behavior Refers to an observable action or the product of an
observable action, including test- or assessment-related
responses.
Assumption 2: Psychological Traits and States Can Be Quantified and Measured
Measuring traits and states by means of a test entails
developing not only appropriate test items but also
appropriate ways to score the test and interpret the
results.
For many varieties of psychological tests, some number
representing the score on the test is derived from the
examinee’s responses.
Cumulative Scoring Is based on the assumption that the more often the
testtaker responds in a particular direction, as keyed by
the test manual as correct or consistent with a particular
trait, ability, or state, the higher the testtaker is presumed
to be on the targeted trait, ability, or state.
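Cumulative scoring can be illustrated with a minimal Python sketch. The item key, the responses, and the function name cumulative_score are hypothetical and are not taken from any real test manual; the sketch only shows that each response matching the keyed direction adds one point to the total.

```python
def cumulative_score(responses, key):
    """Count the responses that match the keyed direction for each item."""
    if len(responses) != len(key):
        raise ValueError("responses and key must have the same length")
    return sum(1 for given, keyed in zip(responses, key) if given == keyed)


# Hypothetical five-item true/false scale; the key marks the answer
# treated as consistent with the targeted trait.
key = ["T", "F", "T", "T", "F"]
responses = ["T", "F", "F", "T", "F"]

print(cumulative_score(responses, key))  # 4
```

Under this scheme, a testtaker with more keyed responses is presumed to stand higher on the targeted trait, ability, or state.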
Assumption 3: Test-Related Behavior Predicts Non-Test-Related Behavior
The objective of the test is to provide some indication of
other aspects of the examinee’s behavior.
The tasks in some tests mimic the actual behaviors that
the test user is attempting to understand. By their nature,
however, such tests yield only a sample of the behavior
that can be expected to be emitted under non-test
conditions. The obtained sample of behavior is typically
used to make predictions about future behavior, such as
the work performance of a job applicant.
Assumption 4: Tests and Other Measurement Techniques Have Strengths and Weaknesses
Competent test users understand and appreciate the
limitations of the tests they use as well as how those
limitations might be compensated for by data from other
sources. This assumption, that test users know the tests
they use and are aware of the tests’ limitations, is
emphasized repeatedly in the codes of ethics of
associations of assessment professionals.
Assumption 5: Various Sources of Error are Part of the Assessment Process
Error In everyday conversation, we use the word error to refer to
mistakes, miscalculations, and the like.
In the context of assessment, error refers to a long-standing
assumption that factors other than what a test attempts to
measure will influence performance on the test.
Error Variance The component of a test score attributable to sources
other than the trait or ability measured.
The element of variability in a score that is produced by
extraneous factors, such as measurement imprecision,
and is not attributable to the independent variable or other
controlled experimental manipulations.
Classical or True Score Theory It is a theory of testing based on the idea that a person's
observed or obtained score on a test is the sum of a true
score (error-free score) and an error score. Generally
speaking, the aim of classical test theory is to understand
and improve the reliability of psychological tests.
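The idea that an observed score is the sum of a true score and an error score is commonly written as below. The symbols X (observed score), T (true score), and E (error score) are notational assumptions, not taken from the handout. The second and third expressions rely on the standard classical-test-theory assumption that true scores and errors are uncorrelated; in that case score variance splits into true variance and error variance, and reliability is the share of observed-score variance that is true variance.

```latex
\[
X = T + E,
\qquad
\sigma_X^2 = \sigma_T^2 + \sigma_E^2,
\qquad
r_{XX} = \frac{\sigma_T^2}{\sigma_X^2}
\]
```

Read this way, improving reliability amounts to reducing the error variance relative to the total variance of observed scores.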
Assumption 6: Testing and Assessment Can Be Conducted in a Fair and Unbiased Manner
Today, all major test publishers strive to develop
instruments that are fair when used in strict accordance
with guidelines in the test manual.
However, despite the best efforts of many professionals,
fairness-related questions and problems do occasionally
arise. One source of fairness-related problems is the test
user who attempts to use a particular test with people
whose background and experience are different from the
background and experience of people for whom the test
was intended.
Assumption 7: Testing and Assessment Benefit Society
A world without tests would most likely be more a
nightmare than a dream.
Without tests or other assessment procedures, people
could present themselves as surgeons, bridge builders, or
airline pilots regardless of their background, ability, or
professional credentials, and personnel might be hired on
the basis of nepotism rather than documented merit.
C. Norms
Norms Are the test performance data of a particular group of
testtakers that are designed for use as a reference when
evaluating or interpreting individual test scores.
Norm-referenced testing and assessment A method of evaluation and a way of deriving meaning
from test scores by evaluating an individual testtaker’s
score and comparing it to scores of a group of testtakers.
In this approach, the meaning of an individual test score is
understood relative to other scores on the same test. A
common goal of norm-referenced tests is to yield
information on a testtaker’s standing or ranking relative to
some comparison group of testtakers.
Types of Norms
Percentile Norms Are the raw data from a test’s standardization sample
converted to percentile form.
Percentile
Is an expression of the percentage of people whose
score on a test or measure falls below a particular raw
score.
Percentage Correct
Refers to the distribution of raw scores—more
specifically, to the number of items that were
answered correctly multiplied by 100 and divided by
the total number of items.
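The difference between a percentile and percentage correct can be sketched in a few lines of Python. The norm-group scores and both function names are hypothetical illustrations. The first function compares a raw score with other people's scores, while the second looks only at the testtaker's own items.

```python
def percentile_rank(raw_score, norm_scores):
    """Percentage of scores in the norm group that fall below raw_score."""
    if not norm_scores:
        raise ValueError("norm_scores must not be empty")
    below = sum(1 for score in norm_scores if score < raw_score)
    return 100 * below / len(norm_scores)


def percentage_correct(items_correct, total_items):
    """Items answered correctly, multiplied by 100, divided by total items."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    return 100 * items_correct / total_items


# Hypothetical standardization sample of ten raw scores on a 40-item test.
norm_scores = [12, 15, 18, 20, 22, 25, 27, 30, 33, 36]

print(percentile_rank(30, norm_scores))  # 70.0
print(percentage_correct(30, 40))        # 75.0
```

The same raw score of 30 yields two different numbers because the percentile depends on the comparison group, whereas percentage correct depends only on the number of items.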
Age Norms Also known as age-equivalent scores, age norms
indicate the average performance of different samples of
testtakers who were at various ages at the time the test
was administered.
Grade Norms Are developed by administering the test to representative
samples of children over a range of consecutive grade
levels (such as first through sixth grades).
Developmental Norms
A term applied broadly to norms developed on the basis
of any trait, ability, skill, or other characteristic that is
presumed to develop, deteriorate, or otherwise be
affected by chronological age, school grade, or stage of
life.
National Norms
Are derived from a normative sample that was nationally
representative of the population at the time the
norming study was conducted. In the fields of
psychology and education, for example, national norms
may be obtained by testing large numbers of people
representative of different variables of interest such as
age, gender, racial/ethnic background, socioeconomic
strata, geographical location (such as North, East, South,
West, Midwest), and different types of communities within
the various parts of the country (such as rural, urban,
suburban).
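National Anchor Norms Provide a tool for comparing scores on two different
tests of the same ability, typically in the form of an
equivalency table for scores on the two tests.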
Just as an anchor provides some stability to a vessel, so
national anchor norms provide some stability to test
scores by anchoring them to other test scores.
Local Norms Provide normative information with respect to the local
population’s performance on some test.
For example, a local company personnel director might find
some nationally standardized test useful in making
selection decisions but might deem the norms published in
the test manual to be far afield of local job applicants’
score distributions.
Subgroup Norms A normative sample can be segmented by any of the
criteria initially used in selecting subjects for the sample.
What results from such segmentation are more
narrowly defined subgroup norms.
Fixed Reference Group Scoring Systems
Fixed Reference Group Scoring Systems The distribution of scores obtained on the test from one
group of testtakers, referred to as the fixed reference
group, is used as the basis for the calculation of test
scores for future administrations of the test.
Perhaps the test most familiar to college students that
exemplifies the use of a fixed reference group scoring
system is the SAT.
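A simplified Python sketch of the fixed reference group idea follows. It assumes a plain linear rescaling against the reference group's mean and standard deviation, with a hypothetical reporting scale of mean 500 and standard deviation 100. The function name and data are illustrative only, and operational testing programs use more elaborate equating procedures than this.

```python
from statistics import mean, pstdev


def make_fixed_reference_scale(reference_scores, scale_mean=500, scale_sd=100):
    """Build a scoring function anchored to one fixed reference group."""
    ref_mean = mean(reference_scores)
    ref_sd = pstdev(reference_scores)
    if ref_sd == 0:
        raise ValueError("reference scores must vary")

    def to_scaled(raw_score):
        # Later testtakers are always scored against the same reference
        # group, not against whoever took the test alongside them.
        return scale_mean + scale_sd * (raw_score - ref_mean) / ref_sd

    return to_scaled


# Hypothetical raw scores from the fixed reference group.
reference_scores = [40, 50, 60]
to_scaled = make_fixed_reference_scale(reference_scores)

print(to_scaled(50))  # 500.0
```

Because the reference statistics are computed once and then frozen, a given raw score maps to the same scaled score in every future administration.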
Norm-Referenced versus Criterion-Referenced Evaluation
Criterion Standard on which a judgment or decision may be based.
Criterion-Referenced Testing and Assessment May be defined as a method of evaluation and a way of
deriving meaning from test scores by evaluating an
individual’s score with reference to a set standard.
Some examples:
To be eligible for a high-school diploma, students
must demonstrate at least a sixth-grade reading level.
To earn the privilege of driving an automobile,
would-be drivers must take a road test and demonstrate their
driving skill to the satisfaction of a state-appointed
examiner.
To be licensed as a psychologist, the applicant must
achieve a score that meets or exceeds the score
mandated by the state on the licensing test.
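The logic shared by these examples can be sketched in Python: each score is judged against a set standard, not against other testtakers. The cutoff of 70, the names, and the scores are hypothetical.

```python
def meets_criterion(score, cutoff):
    """Return True when the score meets or exceeds the set standard."""
    return score >= cutoff


# Hypothetical passing score and candidates for a licensing test.
cutoff = 70
scores = {"Ana": 82, "Ben": 69, "Cara": 70}

results = {name: meets_criterion(score, cutoff) for name, score in scores.items()}
print(results)  # {'Ana': True, 'Ben': False, 'Cara': True}
```

Each decision is independent of the others: all candidates could pass, or none, because no one's result depends on how the rest of the group performed.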
Norm-Referenced Test Refers to standardized tests that are designed to
compare and rank test takers in relation to one
another. Norm-referenced tests report whether test takers
performed better or worse than a hypothetical average
student, which is determined by comparing scores against
the performance results of a statistically selected group of
test takers, typically of the same age or grade level, who
have already taken the exam.
On a norm-referenced test, an individual’s percentile rank
is calculated according to the performance of their peers.
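For example, a raw score of 40 might correspond to the 97th
percentile, meaning that the test taker scored higher than
97 percent of the comparison group.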
References:
Cohen & Swerdlik (2009). Psychological Testing and Assessment: An Introduction to Tests and
Measurement (7th ed.).