
Chapter 4

Of Tests and Testing

McGraw-Hill/Irwin © 2013 McGraw-Hill Companies. All Rights Reserved.


Assumptions about Psychological Testing
Psychological Traits and States Exist
• A trait has been defined as “any distinguishable, relatively
enduring way in which one individual varies from another”
(Guilford, 1959, p. 6).
• States also distinguish one person from another but are
relatively less enduring (Chaplin et al., 1988).
• Thousands of trait terms can be found in the English
language (e.g., outgoing, shy, reliable, calm).
• Psychological traits exist as constructs: informed,
scientific concepts developed or constructed to describe or
explain behavior.
• We can’t see, hear, or touch constructs, but we can infer their
existence from overt behavior, such as test scores.
• Traits are relatively stable. They may change over time, yet there
are often high correlations between trait scores at different
time points.
• The nature of the situation influences how traits will be
manifested.
• Traits refer to ways in which one individual varies, or differs, from
another. For example, some people score higher than others on traits
like sensation-seeking.

Traits and States Can Be Quantified and Measured

• Different test developers may define and measure
constructs in different ways.
• Once a construct is defined, test developers turn to
item content and item weighting.
• A scoring system and a way to interpret results need
to be devised.


Test-Related Behavior Predicts Non-Test-Related Behavior

Responses on tests are thought to predict real-world
behavior. The obtained sample of behavior is expected
to predict future behavior.

Tests Have Strengths and Weaknesses

Competent test users understand and appreciate the
limitations of the tests they use as well as how those
limitations might be compensated for by data from
other sources.
Various Sources of Error are Part of Assessment
Error refers to a long-standing assumption that factors
other than what a test attempts to measure will
influence performance on the test.

Error variance: the component of a test score
attributable to sources other than the trait or ability
measured.

• Both the assessee and assessor are sources of error
variance.
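
The idea of error variance can be written as a variance decomposition. This is a sketch under the classical test theory assumption, which the slides do not state explicitly, that an observed score is the sum of a true score and an error component that is uncorrelated with the true score. The symbols X (observed score), T (true score), and E (error) are introduced here for illustration only.

```latex
X = T + E
\qquad\Longrightarrow\qquad
\sigma^2_X = \sigma^2_T + \sigma^2_E
```

Under that assumption, the error variance is the share of total score variance not attributable to the trait or ability being measured.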
Testing and Assessment can be Conducted in a Fair Manner
• All major test publishers strive to develop instruments that are
fair when used in strict accordance with guidelines in the test
manual.
• Problems arise if the test is used with people for whom it was
not intended.
• Some problems are more political than psychometric in
nature.

Testing and Assessment Benefit Society


• There is a great need for tests, especially good tests,
considering the many areas of our lives that they benefit.
What’s a “Good Test”?
Reliability: The consistency of the measuring tool: the
precision with which the test measures and the extent to
which error is present in measurements.

Validity: The test measures what it purports to measure.

Other considerations: Administration, scoring, and
interpretation should be straightforward for trained
examiners. A good test is a useful test that will ultimately
benefit individual testtakers or society at large.
Norms
• Norm-referenced testing and assessment: a method of
evaluation and a way of deriving meaning from test
scores by evaluating an individual testtaker’s score and
comparing it to scores of a group of testtakers.
• The meaning of an individual test score is understood
relative to other scores on the same test.
• Norms are the test performance data of a particular group
of testtakers that are designed for use as a reference when
evaluating or interpreting individual test scores.
• A normative sample is the reference group to which
testtakers are compared.
Sampling to Develop Norms
Standardization: The process of administering a test to
a representative sample of testtakers for the
purpose of establishing norms.
Sampling: Test developers select the population for
which the test is intended, one whose members share at
least one common, observable characteristic.
Stratified sampling: Sampling that includes different
subgroups, or strata, from the population.
Stratified-random sampling: Every member of the
population has an equal opportunity of being included in
a sample.
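
The two definitions above can be combined in a short sketch: split the population into strata, then draw members at random within each stratum. This is an illustration only, not a procedure taken from the slides. The function name `stratified_random_sample`, the `fraction` parameter, and the made-up population of 90 students in grades 3 to 5 are all assumptions.

```python
import random
from collections import defaultdict


def stratified_random_sample(population, stratum_of, fraction, seed=None):
    """Randomly draw roughly the same fraction of members from every stratum.

    population: list of testtaker records (hypothetical structure)
    stratum_of: function mapping a record to its stratum label
    fraction:   proportion of each stratum to sample, 0 < fraction <= 1
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)

    # Group the population into strata.
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)

    # Draw at random, without replacement, inside each stratum.
    sample = []
    for members in strata.values():
        # At least one member per stratum, so very small strata
        # end up slightly over-represented.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 90 students, 30 in each of grades 3, 4, and 5.
population = [{"id": i, "grade": 3 + i % 3} for i in range(90)]
sample = stratified_random_sample(population, lambda p: p["grade"], 0.1, seed=0)
print(len(sample))
```

Sampling the same fraction from each stratum keeps the subgroups in the sample in roughly the same proportions as in the population.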
Purposive sample: Arbitrarily selecting a sample that
is believed to be representative of the population.
Incidental/convenience sample: A sample that is
convenient or available for use. May not be
representative of the population.
• Generalization of findings from convenience
samples must be made with caution.

Developing Norms
Having obtained a sample, test developers:
• Administer the test with a standard set of instructions
• Recommend a setting for test administration
• Collect and analyze data
• Summarize data using descriptive statistics including
measures of central tendency and variability
• Provide a detailed description of the

Types of Norms
• Percentile - the percentage of people whose score
on a test or measure falls below a particular raw
score.
• Percentiles are a popular method for organizing
test-related data because they are easily calculated.
• One problem is that real differences between raw
scores may be minimized near the ends of the
distribution and exaggerated in the middle of the
distribution.
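
The percentile definition above can be sketched in a few lines of Python. This is a minimal illustration using the "falls below" definition given here; the function name `percentile_rank` and the norm scores are made up for the example.

```python
def percentile_rank(raw_score, norm_scores):
    """Percentage of scores in the normative sample that fall below raw_score."""
    if not norm_scores:
        raise ValueError("norm_scores must not be empty")
    below = sum(1 for s in norm_scores if s < raw_score)
    return 100.0 * below / len(norm_scores)


# Hypothetical normative sample of ten raw scores.
norm_scores = [55, 60, 60, 65, 70, 70, 70, 75, 80, 95]

print(percentile_rank(70, norm_scores))  # 4 of 10 scores are below 70 -> 40.0
print(percentile_rank(95, norm_scores))  # 9 of 10 scores are below 95 -> 90.0
```

In this small sample, the 15-point raw-score gap between 80 and 95 moves the percentile by only 10 points, which illustrates how differences near the ends of a distribution can be minimized.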

Age norms: the average performance of different samples of
testtakers who were at various ages when the test was administered.
Grade norms: the average test performance of testtakers in a
given school grade.
National norms: derived from a normative sample that was
nationally representative of the population at the time the
norming study was conducted.
National anchor norms: an equivalency table for scores on two
different tests, which provides a basis for comparing them.
Subgroup norms: A normative sample can be segmented by any
of the criteria initially used in selecting subjects for the sample.
Local norms: provide normative information with respect to the
local population’s performance on some test.
Fixed Reference Group Scoring Systems
Fixed Reference Group Scoring Systems: The
distribution of scores obtained on the test from one group
of testtakers is used as the basis for the calculation of test
scores for future administrations of the test.
• The SAT employs this method.
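
The idea can be sketched as follows: the mean and standard deviation of one fixed group are stored once, and every later raw score is converted using those same values. This is a simplified illustration and not the actual procedure of the SAT or any other test. The function name, the reference scores, and the reporting scale (mean 500, standard deviation 100) are assumptions.

```python
from statistics import mean, pstdev


def make_fixed_reference_scorer(reference_scores, scale_mean=500, scale_sd=100):
    """Return a scoring function anchored to one fixed reference group.

    The scale_mean and scale_sd defaults are illustrative assumptions.
    """
    ref_mean = mean(reference_scores)
    ref_sd = pstdev(reference_scores)
    if ref_sd == 0:
        raise ValueError("reference scores must vary")

    def score(raw):
        # Express the raw score relative to the fixed reference group,
        # then place it on the reporting scale.
        z = (raw - ref_mean) / ref_sd
        return scale_mean + scale_sd * z

    return score


# Hypothetical fixed reference group (mean 50, standard deviation 10).
score = make_fixed_reference_scorer([40, 40, 60, 60])

# Later testtakers are scored against the same fixed group.
print(score(50))
print(score(60))
```

Because the reference values never change, scores from different administrations stay on one common scale.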
Norm-Referenced versus Criterion-Referenced
Interpretation
Norm-referenced tests involve comparing individuals to
the normative group. With criterion-referenced tests,
testtakers are evaluated as to whether they meet a set
standard (e.g., a driving exam).
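
The contrast can be sketched in code: the same raw score is interpreted once against a normative group and once against a set standard. This is an illustration only. The function names, the norm scores, and the cut score of 18 are hypothetical.

```python
def norm_referenced(raw_score, norm_scores):
    """Interpret a score relative to a group: percentage of norm scores below it."""
    below = sum(1 for s in norm_scores if s < raw_score)
    return 100.0 * below / len(norm_scores)


def criterion_referenced(raw_score, cut_score):
    """Interpret a score against a set standard: does it meet the cut score?"""
    return raw_score >= cut_score


# Hypothetical normative sample and cut score.
norm_scores = [12, 14, 15, 15, 16, 17, 18, 18, 19, 20]
raw = 17

print(norm_referenced(raw, norm_scores))   # standing relative to the group
print(criterion_referenced(raw, 18))       # whether the standard is met
```

Here the testtaker scores above half of the normative group yet still fails to meet the standard, so the two interpretations answer different questions.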
Culture and Inference
• In selecting a test for use, responsible test users
should research the test’s available norms to check
how appropriate they are for use with the targeted
testtaker population.
• When interpreting test results, it helps to know
about the culture and era of the testtaker.
• It is important to conduct culturally informed
assessment.
